Introduction


This book is an elaboration of a course given by the author at Moscow 
University for pupils in the ninth and tenth grades. In it we discuss the 
development through abstraction of the general definition of distance 
and introduce a class of spaces in which the notion of distance is 
defined, the so-called metric spaces. It will be evident from our
discussion that the general concept of distance is related to a large number
of mathematical phenomena. 

With the aid of the concept of distance, it is possible to study problems 
concerning the “shortest” path between two points on a surface, the
geometric properties of multidimensional spaces, methods of “noise” 
reduction in the coding of information, and methods of “smoothing” 
errors in the results of empirical measurements, as well as many other 
such topics. 

The concept of “distance,” moreover, is a good illustration of the 
role played in mathematics by the generalization of specific ideas, the 
results of which at times find some rather unexpected applications. 
Other good examples of such generalizations which have been found 
indispensable to many areas of mathematics may also be cited: the 
notions of function, limit, space, and transformation, as well as the less 
familiar concepts of isomorphism, group, ring, and so on. Of these 
examples, however, the concept of distance seems most suited to the 
type of elementary discussion required by the inexperience of our 
audience, a consideration which is the chief motivation for our choice 
of this particular topic. Our aim is to demonstrate by means accessible 
to a wide range of readers the way in which one fruitful idea can shed 
light on a wide variety of mathematical questions and, at the same time, 
serve as a source of new results and insight in some particular field of 
knowledge. This situation, characteristic of all of the sciences, appears 
quite often in mathematics in particularly striking ways, making
possible a clear understanding without the necessity of mastering a 
myriad of confusing details. The material for this book has been chosen 
with this general idea in mind. 

The first four chapters are intended to expose the reader to the 
generalization of the ordinary geometric definition of distance and to 
the illustration of the generalized concept via concrete situations. 
Chapter 5 describes the so-called space of information, a concept that 
plays a major role in the theory of information and the general theory 
of communication. Chapter 6 deals with methods of coding information 
which allow that information to be relatively unaffected by errors in the 
process of transmission. Since in all real communications devices, 
errors occur in a number of ways, such methods of coding are essential 
for modern systems of communication and control. For example, in the 
transmission of photographs from the far side of the moon by a Soviet 
space vehicle, error-reducing methods of codification had to be used. 
It is important to note that each of these methods involves the use of
the generalized concept of distance in the space of information. 

The material in chapter 7 is somewhat more complicated; there we 
deal with an important class of spaces to which the notion of distance is 
common. Chapter 8 describes the application of the generalized concept 
of distance to the problem of “smoothing” errors in the results of 
empirical measurements—the problem of finding a mathematical 
process which will nearly eliminate the effect of error in experimental 
data. This chapter is essentially an exposition of the method of least 
squares. Some knowledge of differential calculus is necessary for an 
understanding of this chapter. The reader who has not had the necessary 
background may omit this section. 

In the final chapter, the possibility of further generalization of the 
concept of distance is examined. In this chapter I wish primarily to show 
that it is not necessarily true that all generalizations possess interesting 
properties. It is not always easy to develop a good generalization of a 
mathematical concept. At the core of any worthwhile generalization are 
some essential properties of the real world. In particular, the concept of 
distance is important because many essential properties of real objects 
are related to their mutual disposition, which can frequently be
characterized by a properly defined concept of distance. For example,
although it is impossible to describe the electrons of an atom as point 
masses, quantum mechanics is nevertheless able to determine the 
“distance” between the two energy states of electrons. This “distance” 
is related conceptually to the “distance” defined in the so-called $l_2$
space discussed in chapter 7. 

I shall consider my task complete if this book is able to give the reader
a satisfactory understanding of the ideas mentioned above. 

I wish to take this opportunity to express my gratitude to I. M. 
Yaglom, who has provided much valuable advice concerning the 
improvement of this manuscript. 


1 The Definition of Mathematical Concepts


At first glance, the title of this book may seem surprising. Every 
schoolboy, it would seem, knows what distance is. Even a person who 
has completely forgotten his high-school geometry and who cannot 
accurately formulate a definition of distance would be quick to assert 
that he knows very well what distance is. 

But, in fact, the matter is much more complicated. 

The word distance can take on different meanings depending upon 
what particular space one is talking about. We are about to see that this 
is true even in situations with which we are well acquainted. 

In the Euclidean plane and in ordinary three-dimensional Euclidean
space, the distance between two points M and N is defined as the length
of the line segment MN joining those points. When dealing with distances
between geographical loci on the surface of the earth, however, we
usually have in mind the length of the smaller arc of the great circle
joining those localities. The difference between these two meanings of
distance becomes particularly noticeable if we calculate the distance
between the north pole N and the south pole S (see fig. 1.1). The
ordinary (Euclidean) distance between the poles is equal to the diameter
of the earth, approximately 8,000 miles. The distance between the poles
along the surface of the earth is, however, greater than this by a factor
of π/2; it is about 12,500 miles.

Fig. 1.1
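
The comparison of the two polar distances can be checked with a few lines of Python. This is only a sketch: the function name `chord_and_arc` is our own, and the earth is treated as a perfect sphere with the rounded diameter of 8,000 miles quoted above.

```python
import math

def chord_and_arc(diameter_miles):
    """Return (straight-line, surface) distance between two antipodal points."""
    radius = diameter_miles / 2
    chord = diameter_miles      # Euclidean distance: the diameter itself
    arc = math.pi * radius      # half of a great circle
    return chord, arc

# 8,000 miles is the rounded diameter used in the text, not a measured value.
chord, arc = chord_and_arc(8000)
print(round(chord), round(arc))   # 8000 12566
print(arc / chord)                # the factor pi / 2
```

With the rounded diameter the surface distance comes out near 12,566 miles, in agreement with the figure of about 12,500 miles.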

To this example one might add that, in commerce, even the means of
transportation to be used must be taken into account in the estimation
of distances between cities. For example, the distance between two points
by car may differ from the distance by train.

We can obtain another example of distance if we consider points in 
rugged terrain and define the distance between two such points as the 
time necessary for someone on foot to travel from one point to another. 

It is clear that this distance has nothing in common with the length of
the line segment joining two points, for the straight line, in general, is
not the best, or even a possible, path. Indeed, a foot traveler will
calculate the distance between two points by the time he spends in travel
between them.

Despite differences among these means of measurement, however, it 
is evident that all meanings taken on by the word distance have something
in common. A measure of “how far apart” two objects are is
always indicated. Thus, one may suppose that there exists some common 
definition of distance which has various interpretations in various 
concrete situations. Such a general definition will be formulated in 
chapter 3. But first we shall consider what, in general, is necessary for 
the definition of a mathematical concept. 

Modern mathematics is the language of natural science. Underlying 
the most important mathematical ideas are spatial-temporal facts about 
the world in which we live. However, the relationship between these 
facts and the corresponding mathematical ideas is sometimes very 
complicated. 

In every branch of mathematics there are some fundamental concepts
which are related in our minds to certain physical images. Some of the
fundamental properties of these concepts are formulated as axioms (or
postulates): “truths” that are not proved but accepted as a starting
point. All of the remaining propositions of the given branch of mathematics
are derived logically from these axioms without reference to the
properties of the physical world. The very formulation of a set of axioms 
expresses to some degree the relationship between intuitive knowledge 
of properties associated with these ideas and the empirically obvious 
properties of their physical forms. 

Some of the most important concepts involved in geometry are the
ideas of point, straight line, plane, space, and so on. In a systematic
geometry course it is necessary to develop a list of the most basic
properties of these concepts in the form of a set of axioms, the basis on
which the whole structure of geometry is built.¹

1. The first to fashion such an exposition of geometry was the ancient Greek
mathematician Euclid (fourth-third century B.C.).

Some of the principal concepts involved in algebra are those of sets of
numbers and operations on these numbers. For example, the structures
of the integers, rational numbers, algebraic numbers, real numbers,
complex numbers, and so on, are studied.

In each of the five number systems specifically mentioned above, one 
can verify that certain fundamental laws concerning operations on 
numbers are satisfied. These are the commutative law for addition
(a + b = b + a), the associative law for addition ([a + b] + c =
a + [b + c]), the commutative law for multiplication (ab = ba), the
associative law for multiplication ([ab]c = a[bc]), the distributive law
([a + b]c = ac + bc), and the rules a − a = 0, a × 1/a = 1 for a ≠ 0,
which characterize the relationship between the principal operations 
(addition and multiplication) and their inverses (subtraction and
division). All of these laws are satisfied in the number systems listed above
to which they apply. However, it is not always the case that a given 
operation is defined in a given number system. Division is not always 
possible within the integers and, therefore, is not well-defined as an 
operation on the set of integers. If a number system contains only 
positive numbers, subtraction is not always possible. As it happens, 
certain rules for algebraic transformation of various expressions depend 
only on the properties listed above. For example, all of the rules for the 
solution of first-degree equations and systems of such equations are 
based upon these laws and upon the possibility of carrying out the 
operation of division. 
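
The laws just listed can be spot-checked mechanically. The following sketch, in which the function names and the sample of rational numbers are our own illustrative choices, tests each law on every triple from a small sample and then shows that a quotient of integers need not be an integer. A finite check of this kind illustrates the laws but, of course, does not prove them.

```python
from fractions import Fraction
from itertools import product

def satisfies_laws(samples):
    """Check the laws listed above on every triple drawn from `samples`."""
    for a, b, c in product(samples, repeat=3):
        laws = (
            a + b == b + a,                  # commutative law for addition
            (a + b) + c == a + (b + c),      # associative law for addition
            a * b == b * a,                  # commutative law for multiplication
            (a * b) * c == a * (b * c),      # associative law for multiplication
            (a + b) * c == a * c + b * c,    # distributive law
            a - a == 0,                      # subtraction undoes addition
            a == 0 or a * (1 / a) == 1,      # division undoes multiplication
        )
        if not all(laws):
            return False
    return True

def quotient_is_integer(a, b):
    """True if a / b (with b nonzero) is again an integer."""
    return Fraction(a, b).denominator == 1

# A hypothetical sample of rational numbers, chosen only for illustration.
rationals = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]
print(satisfies_laws(rationals))    # True
print(quotient_is_integer(6, 3))    # True: 6 / 3 = 2
print(quotient_is_integer(1, 2))    # False: division leads out of the integers
```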

It turns out, in fact, that it is possible to study many properties of 
various number systems as consequences of the general theory of 
systems on which defined operations (called addition and multiplication) 
satisfy the properties listed above. Such systems are termed commutative 
rings or fields in modern algebra (depending on whether it is always 
possible to carry out division).²

It is possible to view the rules for transformation of expressions and 
for solution of equations in the case of an arbitrary field or ring and to 
look at the rules normally developed in high-school algebra as special 
cases. 

In contemporary algebra, rings and fields are usually studied as 
generalizations of number systems studied in high school. The basic 
properties of operations that can be carried out for integers or for 
rational numbers are set down as a starting point, and facts that may be 
derived logically using these properties alone are studied. 

In taking this approach, mathematicians are interested not only in
discovering new properties of the physical world and establishing
relationships among these properties, but also in clarifying properties of
“imaginary” worlds developed by using axioms similar to those of the
number systems most closely related to physical reality.

2. For a definition of ring and field see Birkhoff and MacLane, A Survey of
Modern Algebra (New York: Macmillan, 1965).

This facet of mathematics is no less important than the possibility of 
describing the physical world. The Russian mathematician N. I. 
Lobachevskii, by altering one of Euclid’s postulates, created an 
“imaginary” geometry, which, long afterwards, served as the basis of 
new physical concepts of the universe arising from Einstein’s
development of the theory of relativity.

In this book we shall study one of the most important of mathematical
concepts: the concept of distance.

Our first attempt will be the listing of those properties of distance 
which are essential to elementary geometry. With these laws as our 
basis, we shall derive the definition of a so-called metric space and study 
various examples of such spaces. We shall see that such a specifically 
mathematical approach to the study of certain concepts from the point 
of view of a generalized concept reveals many interesting facts. 

This approach, the creation of generalized concepts and the attempt
to describe physical realities with the aid of these concepts, is
characteristic of modern mathematics and its fields of application.³ From
this point of view, the concept of distance provides a good example of 
the fruitfulness of such an approach. 


3. We must not overlook the role played in cybernetics by such generalized 
mathematical concepts as information, automata theory, and algorithm.


2 Distance and Its Properties in Elementary Geometry


We hope to arrive at a general definition of distance by generalizing 
the properties of “ordinary” distance in three-dimensional Euclidean 
space. Therefore, we shall first attempt to list the fundamental properties
of ordinary distance. 

Let us agree to denote the distance between two points M and N in 
three-dimensional space—the length of the line segment MN—as 
d(M, N). 

This notation emphasizes the fact that the distance between M and N 
is a real number which is completely determined by points M and N. In 
other words, distance is a real-valued function of pairs of points. If we 
characterize each point by an ordered triple of coordinates, say
$M = (x, y, z)$ and $N = (x_1, y_1, z_1)$, then distance in three-space
becomes a function of six variables:

$$d(M, N) = F(x, y, z, x_1, y_1, z_1).$$

Fig. 2.1    Fig. 2.2



With the aid of figure 2.1, one can derive a closed algebraic expression
for this function. Pictured is a rectangular parallelepiped with sides
parallel to the coordinate axes. We know that the square of the length of
the diagonal of a rectangular parallelepiped is equal to the sum of the
squares of the lengths of its three edges. Consequently,

$$MN^2 = (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2,$$

or

$$d(M, N) = \sqrt{(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2}. \tag{2.1}$$


It is even simpler to calculate the distance between the points
$M = (x, y)$ and $N = (x_1, y_1)$ in the Euclidean plane (see fig. 2.2).
For this calculation, we need only note that the length of the line
segment ML is just $|x - x_1|$, and, similarly, that the length of the
line segment LN is $|y - y_1|$. By the Pythagorean theorem,

$$MN^2 = ML^2 + LN^2,$$

so that

$$d(M, N) = \sqrt{(x - x_1)^2 + (y - y_1)^2}. \tag{2.2}$$
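
Formulas (2.1) and (2.2) translate directly into a short computation. In the sketch below the function name `distance` and the sample points are our own; the same function serves for the plane and for three-space, since it simply sums the squared coordinate differences.

```python
import math

def distance(m, n):
    """Euclidean distance between two points given as coordinate tuples."""
    if len(m) != len(n):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(m, n)))

print(distance((0, 0), (3, 4)))          # plane, formula (2.2): 5.0
print(distance((1, 2, 3), (3, 5, 9)))    # three-space, formula (2.1): 7.0
```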


Despite the importance of equations (2.1) and (2.2), the properties 
of distance that we shall need can be obtained without the use of a 
coordinate system. 

These properties can be formulated as follows: 

1. $d(M, N) = d(N, M)$ (symmetry).

2. $d(M, N) \ge 0$ (nonnegativity).

3. $d(M, N) = 0$ if and only if the points M and N coincide
(nondegeneracy).

4. $d(M, N) \le d(M, L) + d(L, N)$ for arbitrary points M, N, and L
(the triangle inequality).

Properties 1, 2, and 3 are obviously basic to Euclidean distance. They 
indicate simply that the length of the segment MN is equal to the 
length of the segment NM, that this length is always nonnegative, and 
that it is equal to zero if and only if the two endpoints of the segment 
coincide. 

Property 4 becomes evident if we draw the plane determined by the 
points M, L, and N (and, therefore, containing the triangle MLN) 
(fig. 2.3). Property 4 then indicates only that the length of side MN does 


Distance and Its Properties In Elementary Geometry 7 


not exceed the sum of the lengths of the remaining sides of the triangle 
(hence the name triangle inequality). In other words, the straight line 
segment MN is the shortest path joining the points M and N. 
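
As an informal check of properties 1 through 4, one may test them numerically on a handful of points. The sketch below uses randomly chosen points in three-space and a small tolerance `tol` for rounding error; the function names and the sample are assumptions of ours, and a numerical test of this kind is an illustration, not a proof.

```python
import itertools
import math
import random

def d(m, n):
    """Ordinary Euclidean distance between coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(m, n)))

def check_properties(points, tol=1e-12):
    """Test properties 1-4 on all pairs and triples of the given points."""
    for m, n in itertools.product(points, repeat=2):
        if abs(d(m, n) - d(n, m)) > tol:       # 1: symmetry
            return False
        if d(m, n) < 0:                        # 2: nonnegativity
            return False
        if (d(m, n) == 0) != (m == n):         # 3: nondegeneracy
            return False
    for m, n, l in itertools.product(points, repeat=3):
        if d(m, n) > d(m, l) + d(l, n) + tol:  # 4: triangle inequality
            return False
    return True

random.seed(0)  # fixed seed so that the sample is reproducible
points = [tuple(random.uniform(-10, 10) for _ in range(3)) for _ in range(12)]
print(check_properties(points))
```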


Fig. 2.3    Fig. 2.4


In fact, the triangle inequality becomes a strict inequality [d(M, N) < 
d(M, L) + d(L, N)] in Euclidean three-space when we introduce the 
added restriction that L does not lie on segment MN. Hence, we can 
conclude that the length of segment MN is strictly less than the length 
of a broken line consisting of an arbitrary finite number of segments 
whose union joins the points M and N. In order to justify this conclusion 
(fig. 2.4), we shall repeatedly decrease by one the number of segments in
the broken line, until, finally, only two segments remain. At each step
in this process the length of the broken line will be strictly lessened until
we reach the segment MN itself. Thus, in figure 2.4 we go from the
broken line $ML_1L_2L_3N$ to the broken line $ML_1L_2N$, then to the broken
line $ML_2N$, and finally to the segment MN. Each time the length of the
broken line decreases, and thus the length of the original broken
line is strictly greater than the length of the segment MN.


Table 2.1

Broken line      Its length                                             Application of the strict triangle inequality
$ML_1L_2L_3N$    $d(M, L_1) + d(L_1, L_2) + d(L_2, L_3) + d(L_3, N)$    $d(L_2, L_3) + d(L_3, N) > d(L_2, N)$
$ML_1L_2N$       $d(M, L_1) + d(L_1, L_2) + d(L_2, N)$                  $d(M, L_1) + d(L_1, L_2) > d(M, L_2)$
$ML_2N$          $d(M, L_2) + d(L_2, N)$                                $d(M, L_2) + d(L_2, N) > d(M, N)$
$MN$             $d(M, N)$

Let us note that in this deduction we use only the strict triangle
inequality for Euclidean space. This can be best illustrated by table 2.1.
From this table it is evident how, by replacing the sums in the second
column by lesser sums using the inequalities from the third column, we
arrive at the conclusion that

$$d(M, L_1) + d(L_1, L_2) + d(L_2, L_3) + d(L_3, N) > d(M, N).$$
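
The shortening of a broken line one vertex at a time can be followed numerically. In the sketch below the coordinates of the points are hypothetical, and the interior vertices are removed from the end of the line rather than in the exact order of the table; the order does not affect the conclusion, since each removal applies the strict triangle inequality once.

```python
import math

def length(path):
    """Total length of a broken line given as a list of points."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))

def shorten_step_by_step(path):
    """Remove the last interior vertex repeatedly, recording each length."""
    lengths = [length(path)]
    while len(path) > 2:
        path = path[:-2] + path[-1:]   # drop the vertex just before N
        lengths.append(length(path))
    return lengths

# Hypothetical points M, L1, L2, L3, N in the plane, no three on one line.
broken_line = [(0, 0), (1, 2), (3, -1), (4, 2), (6, 0)]
for value in shorten_step_by_step(broken_line):
    print(round(value, 3))
```

The printed lengths decrease strictly, and the last of them is the length of the segment MN itself.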


If, in addition, we use the fact that the length of a curve is the limit 
of the lengths of broken segments approximating the curve, it is possible 
to prove the following assertion:

Of all the paths joining points M and N, the straight line segment MN 
has the smallest length. 

From the triangle inequality it follows that 


$$d(L, N) \ge d(M, N) - d(M, L). \tag{2.3}$$


Let us emphasize that equality holds in the triangle inequality for our 
three-dimensional example if and only if the points M, N, and L lie on 
the same straight line and L is located “between” M and N (that is, L
lies on the segment MN). 

Let us now examine a distance function on the surface of a sphere S 
of radius r. 

We define the distance between two points M and N on the surface of 
a sphere as the length of the smaller arc of the great circle passing 
through the points M and N. Let us recall that a circle lying on the 
surface of a sphere is called a great circle if its center coincides with the 
center of the sphere. In other words, a great circle lies on the plane 
passing through the points M, N, and O (O being the center of the 
sphere). It follows that each pair of distinct points M and N that are not
antipodal uniquely determines a great circle, since three points not
lying on one straight line uniquely determine a plane. The distance
$d_s(M, N)$ defined in this way clearly satisfies properties 1, 2, and 3.
It is not difficult to see further that for arbitrary points M and N on
the sphere,

$$d_s(M, N) \le \pi r, \tag{2.4}$$


with equality holding only for points M and N lying at the endpoints of 
a diameter of the sphere (for example, the North and South Poles). 
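
The spherical distance can be computed from coordinates if we assume, as the text does not require, that the sphere is centered at the origin. The length of the smaller arc is then r times the angle MON, and that angle is found from the scalar product of the two position vectors. The function name and the sample points below are our own.

```python
import math

def spherical_distance(m, n, r=1.0):
    """Length of the smaller great-circle arc between m and n.

    Both points are assumed to lie on the sphere of radius r centered
    at the origin.
    """
    dot = sum(a * b for a, b in zip(m, n))
    cos_angle = max(-1.0, min(1.0, dot / (r * r)))   # guard against rounding
    return r * math.acos(cos_angle)

north, south, equator = (0, 0, 1), (0, 0, -1), (1, 0, 0)
print(spherical_distance(north, south))     # pi * r, the bound in (2.4)
print(spherical_distance(north, equator))   # pi * r / 2
```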

To verify the fourth distance property, it is necessary to examine the
spherical triangle MLN (fig. 2.5). (The point O is the center of the
sphere.)

Fig. 2.5

It is clear that

$$d_s(M, N) = r\alpha, \quad d_s(L, N) = r\beta, \quad d_s(L, M) = r\gamma,$$

where $\alpha$, $\beta$, and $\gamma$ are the radian measures of angles
MON, LON, and LOM, respectively.

It is well known that in a trihedral angle such as the one formed by the
rays OM, OL, and ON, none of the planar angles exceeds the sum of the two
other planar angles; in particular,

$$\alpha \le \beta + \gamma. \tag{2.5}$$

Multiplying both sides of this inequality by the radius r, we obtain

$$r\alpha \le r\beta + r\gamma,$$

or

$$d_s(M, N) \le d_s(M, L) + d_s(L, N), \tag{2.6}$$


the inequality we set out to establish. 

Thus, all of the fundamental properties of ordinary distance are
satisfied by the spherical distance $d_s(M, N)$.
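
Inequality (2.6) can also be tested on randomly chosen points of a sphere. The sketch below again assumes a sphere centered at the origin; the radius, the number of points, and the tolerance `tol` are arbitrary choices of ours, and the test illustrates the inequality without proving it.

```python
import itertools
import math
import random

def d_s(m, n, r):
    """Smaller great-circle arc between m and n on a sphere of radius r."""
    dot = sum(a * b for a, b in zip(m, n))
    return r * math.acos(max(-1.0, min(1.0, dot / (r * r))))

def random_point_on_sphere(rng, r):
    """Scale a random Gaussian vector to length r."""
    while True:
        v = [rng.gauss(0, 1) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-9:
            return tuple(r * c / norm for c in v)

def triangle_inequality_holds(points, r, tol=1e-7):
    """Check d_s(M, N) <= d_s(M, L) + d_s(L, N) for every triple."""
    return all(
        d_s(m, n, r) <= d_s(m, l, r) + d_s(l, n, r) + tol
        for m, n, l in itertools.product(points, repeat=3)
    )

rng = random.Random(1)   # fixed seed so that the sample is reproducible
radius = 2.0
sample = [random_point_on_sphere(rng, radius) for _ in range(15)]
print(triangle_inequality_holds(sample, radius))
```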

It is easy to show that equality holds in inequality (2.6) if and only if 
two conditions are satisfied: first, that the point L is located on the 
same great circle as the points M and N; and second, that L lies 
“between” M and N—on the smaller arc of the great circle determined 
by M and N. 

This follows from the fact that inequality (2.6) becomes an equality 
only when equality holds in (2.5). But this can occur only when the 
trihedral angle degenerates into a planar one—that is, when the points 
M, N, and L lie on a plane passing through the center O of the sphere 
and ray OL is located between rays OM and ON. But this implies that 
the point L lies on the smaller arc of the great circle joining points M
and N. 

It is evident that the smaller arc of the great circle joining points M 
and N possesses properties analogous to those of the straight line 
segment in ordinary (nonspherical Euclidean) geometry. In particular, 
(1) through a pair of arbitrary distinct points there passes exactly one
such arc (with the exception of the case where the points M and N lie at
the endpoints of a diameter of the sphere, that is, where they are
antipodal, in which case both arcs of any great circle joining M and N
are of equal length, and there are infinitely many such circles); (2) for
any point L lying on such an arc joining the points M and N, the equation

$$d_s(M, L) + d_s(L, N) = d_s(M, N)$$


holds. 

Let us note at this point an important extension of a fact proven in 
Euclidean three-space. For ordinary distance we have shown that the 
length of any broken line joining two points M and N is greater than 
the distance between the points M and N, that is, than the length of the 
segment MN. Here we base our reasoning only on the triangle inequality
and on the fact that equality holds only if the points M, L, and
N lie on the same segment (with L “between” M and N). Since the 
triangle inequality is also true for the distance function we have defined 
on the sphere, with the ordinary line segment corresponding here to the 
smaller arc of a great circle, it is apparent that an analogous assertion 
is true on the sphere: If the points M and N are joined by a broken 
sequence of arcs of great circles (fig. 2.6) in which successive arcs are 
joined by a common endpoint, then the total length of such a “spherical 
broken line” is greater than the distance $d_s(M, N)$.¹

We suggest that the reader write up a full proof of this assertion in 
analogy with the proof for ordinary distance carried out above. This 
assertion can easily be generalized (using limit arguments) in the 
following form: The length of the smaller arc of the great circle joining 
the points M and N is less than the length of any other path on the 
sphere connecting these points.

Thus, we have examined two examples of distance and determined
that their fundamental properties are the same. Auxiliary properties
such as (2.4), properties peculiar to the particular example, play a much
smaller role.

Therefore, our next step will be to take the fundamental properties of
distance (1, 2, 3, and 4) as axioms and to study various spaces in which a
distance satisfying these axioms is defined. In this chapter we have
examined two elementary examples of such spaces: ordinary Euclidean
three-space and the surface of the sphere.

Fig. 2.6


1. We must assume here, of course, that this sequence actually is “broken”; 
that is, that it does not lie entirely on the smaller arc of the great circle joining the 
points M and N. 





3 The Definition of a Metric Space and of Distance


We shall begin with an explanation of what a set is. Like the notion of 
point in geometry, the concept of set is fundamental and yet difficult to 
define. The word set is used in mathematics to indicate a collection of 
objects called elements of the set. 

The concept of set has important applications in any situation where 
a general property is assigned to certain objects. When these objects 
fall into some class according to some sort of rule, they form a set. We 
shall say that a set contains each of its elements, and that each element 
of a given set is contained in it. A set is considered specified if for any 
arbitrary object it is possible to determine whether or not it is contained 
in the given set.¹

Let us consider, for example, the set of all integers. The sun is not 
contained in this set as it is not a number but an object of an entirely 
different sort. The number $\pi$ is not contained in this set, for it is not
integral. On the other hand, the roots of the equation $x^2 - 3x + 2 = 0$
are contained in this set. It is possible to examine the set of all planets 
of the solar system, where we define planets as bodies moving around 
the sun in a closed orbit and weighing no less than one ton. The sun is 
not contained in this set, since it does not (strictly) move around itself. 
The earth is contained in this set. The Soviet rocket launched from the 
earth into an orbit about the sun on January 2, 1959, is also contained 
in this set; it is an artificial planet. 

Let E be some set and N one of its elements. This relation is written
symbolically as $N \in E$ and is read “N is an element of E.” A symbolic
notation of the type E = {L, M, N, ...} is also used, where each
element of the set is enumerated within the brackets. Thus, the set $E_c$,
consisting of all of the capitals of the Soviet republics, could be written
symbolically as $E_c$ = {Moscow, Kiev, Minsk, Tbilisi, Yerevan, Baku,
Riga, Tallinn, Vilnius, Tashkent, Alma-Ata, Frunze, Ashkhabad,
Dyushambe, Kishinev}.

1. The question of what sort of method of determination is to be considered
“effective” is of great interest in mathematical logic and philosophy, but it will
not concern us here. An analogous difficulty is inherent in all formal classification
systems. As an example, we may cite the biological difficulty in defining what sort
of anthropomorphic beings belong to the class Homo sapiens.

If every element of a set E is at the same time an element of a set $E_1$,
the set E is called a subset of the set $E_1$. This is written as
$E \subset E_1$ (“E is contained in $E_1$”). For example, the set of all
integers is a subset of the
set of all real numbers. 

A set E is called finite if each of its elements can be associated with
(mapped to) a different element of some set of the form
$E_m = \{1, 2, 3, \ldots, m\}$. In other words, for a set E to be finite,
there must exist a function F from E to $E_m$ such that for each pair of
elements a and b in E, F(a) = F(b) implies a = b. For example, the set
$E_c$ of capitals of the Soviet republics is finite, since it is possible to
enumerate this set using the elements of the set $E_{15}$, as is evident
from table 3.1.


Table 3.1 


Moscow     1    Yerevan    5    Vilnius     9    Ashkhabad   13
Kiev       2    Baku       6    Tashkent   10    Dyushambe   14
Minsk      3    Riga       7    Alma-Ata   11    Kishinev    15
Tbilisi    4    Tallinn    8    Frunze     12
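
The enumeration in the table is exactly a function F of the kind required by the definition of a finite set. The following sketch builds it as a Python dictionary and checks that distinct capitals receive distinct numbers between 1 and 15; the function name `is_injective_into_first_m` is our own.

```python
def is_injective_into_first_m(mapping, m):
    """True if distinct keys are sent to distinct numbers in {1, ..., m}."""
    values = list(mapping.values())
    in_range = all(isinstance(v, int) and 1 <= v <= m for v in values)
    return in_range and len(set(values)) == len(values)

capitals = [
    "Moscow", "Kiev", "Minsk", "Tbilisi", "Yerevan", "Baku", "Riga",
    "Tallinn", "Vilnius", "Tashkent", "Alma-Ata", "Frunze", "Ashkhabad",
    "Dyushambe", "Kishinev",
]
# Number the capitals 1, 2, ..., 15 in the order of the table.
enumeration = {city: number for number, city in enumerate(capitals, start=1)}

print(is_injective_into_first_m(enumeration, 15))   # True
print(is_injective_into_first_m(enumeration, 14))   # False: 15 is out of range
```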


We are now in a position to give a definition of a metric space. 

A metric space (E, d) is a set E in which for each pair of elements M
and N a real number d(M, N) is defined and the following properties
are satisfied:

1. $d(M, N) = d(N, M)$ (symmetry).

2. $d(M, N) \ge 0$ (nonnegativity).

3. $d(M, N) = 0$ if and only if M and N are the same element
(nondegeneracy).

4. $d(M, N) \le d(M, L) + d(L, N)$ for each triple (M, N, L) of
elements of the set E (triangle inequality).

We shall call the elements of the set E the points of the space (E, d). 
A metric space is thus completely determined by the choice of the set E 
and the function d—the distance function in the space. For the sake of 
simplicity, we shall denote a given space by the same letter as its 
corresponding set, although, in fact, the space and the set of its elements 
are quite different objects. In fact, it is often possible to define more
than one distance function on a set E; each such function, along with
the set E, determines a different metric space. In chapter 4 we shall
construct new definitions of distance (and thus new metric spaces) in 
the plane. 
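
For a finite set the four axioms can be verified by direct enumeration. The sketch below is an illustration under our own assumptions: the set of four integers and the three candidate functions are hypothetical examples, not ones taken from the text. It shows that the same set may carry several distance functions, and that a plausible-looking function may fail to be one.

```python
from itertools import product

def is_metric(points, d):
    """Check axioms 1-4 for the function d on a finite collection of points."""
    for m, n in product(points, repeat=2):
        if d(m, n) != d(n, m):            # 1: symmetry
            return False
        if d(m, n) < 0:                   # 2: nonnegativity
            return False
        if (d(m, n) == 0) != (m == n):    # 3: nondegeneracy
            return False
    for m, n, l in product(points, repeat=3):
        if d(m, n) > d(m, l) + d(l, n):   # 4: triangle inequality
            return False
    return True

points = [0, 1, 2, 5]
print(is_metric(points, lambda a, b: abs(a - b)))           # True
print(is_metric(points, lambda a, b: 0 if a == b else 1))   # True
print(is_metric(points, lambda a, b: (a - b) ** 2))         # False
```

The squared difference fails the triangle inequality: for the points 0, 1, and 2 it gives 4 on the left and 1 + 1 on the right.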

In place of the four distance axioms listed above, it is possible to 
introduce only two (supposing as before that d(M, N) is a real number): 
1’. $d(M, N) = 0$ if and only if the points M and N are the same.

2’. $d(M, N) \le d(M, L) + d(N, L)$.

First of all, these properties follow from properties 1, 2, 3, and 4, as
property 1’ is property 3, and property 2’ follows from the triangle
inequality and condition 1.

On the other hand, from properties 1’ and 2’ alone it is possible to 
deduce all of the conditions 1, 2, 3, and 4. 

To prove this, let us suppose first that in 2’, L = M, so that

$$d(M, N) \le d(M, M) + d(N, M).$$

By 1’, d(M, M) = 0. Therefore, $d(M, N) \le d(N, M)$. By interchanging
in 2’ the positions of points M and N and carrying out the analogous
argument, we see that $d(N, M) \le d(M, N)$. From these last two
inequalities we get the axiom of symmetry (1):

$$d(M, N) = d(N, M).$$

Substituting M for N and N for L in 2’, we get

$$d(M, M) \le d(M, N) + d(M, N) = 2d(M, N),$$

so that, by virtue of 1’,

$$0 \le 2d(M, N),$$

implying

$$0 \le d(M, N),$$

which is property 2 (nonnegativity). Again, using the condition of
symmetry which we proved above, we can interchange N and L in the
second term on the right side of 2’ and get the triangle inequality 4.
Thus, the system of axioms 1’ and 2’ is equivalent to the system 1, 2, 3, 
and 4. It is more convenient to use the latter system, however, as it 
gives in a clearer form the same fundamental properties of distance. 
Still, it is interesting to note that all of these properties can be embodied 
in a pair of axioms. 
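
The equivalence of the two systems of axioms can be confirmed by brute force on a small example. The sketch below runs through every function d on a hypothetical three-element set with values taken from a short list (which deliberately includes a negative number and a value large enough to violate the triangle inequality) and checks that the two systems accept exactly the same functions. This is a check of the argument on one small case, not a substitute for the proof.

```python
from itertools import product

POINTS = (0, 1, 2)                       # a hypothetical three-element set
PAIRS = list(product(POINTS, repeat=2))
TRIPLES = list(product(POINTS, repeat=3))

def satisfies_1_to_4(d):
    """Axioms 1, 2, 3, and 4 for d given as a dictionary on pairs."""
    return (
        all((d[m, n] == 0) == (m == n) for m, n in PAIRS)               # 3
        and all(d[m, n] == d[n, m] for m, n in PAIRS)                   # 1
        and all(d[m, n] >= 0 for m, n in PAIRS)                         # 2
        and all(d[m, n] <= d[m, l] + d[l, n] for m, n, l in TRIPLES)    # 4
    )

def satisfies_1p_2p(d):
    """The reduced system: axioms 1' and 2'."""
    return (
        all((d[m, n] == 0) == (m == n) for m, n in PAIRS)               # 1'
        and all(d[m, n] <= d[m, l] + d[n, l] for m, n, l in TRIPLES)    # 2'
    )

def systems_agree(values=(-1, 0, 1, 3)):
    """Compare the two systems on every function from PAIRS into `values`."""
    for assignment in product(values, repeat=len(PAIRS)):
        d = dict(zip(PAIRS, assignment))
        if satisfies_1_to_4(d) != satisfies_1p_2p(d):
            return False
    return True

print(systems_agree())   # True
```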

From the point of view of the definition which we have introduced,
the content of the preceding chapter might be described as a proof that
the set of points in three-dimensional Euclidean space along with a
distance function defined as the length of the line segment joining a
given pair of points is a metric space. At the end of the same chapter,
we established that the set of points on the surface of a sphere, together
with the distance function $d_s$, forms a metric space.

We can get another example of a metric space if we consider the set
of points of some surface $\pi$ in three-dimensional space and define the
distance $d_\pi(M, N)$ as the minimum length of the paths passing along
the surface $\pi$ and joining the points M and N.² The first three
properties of distance are then immediately evident.

The triangle inequality can be verified in the following manner: Let
us connect the points M and L, as well as the points L and N, by a path
of the shortest possible length. Let us then connect the points M and N
by the path made up of these two minimal paths ML and LN. Clearly, the
length of this combined path cannot be less than the length of the
shortest path joining M and N, since it is itself a path joining M and N.
Since the length of the combined path is $d_\pi(M, L) + d_\pi(L, N)$, and
the length of the shortest path between M and N is $d_\pi(M, N)$, the
desired relation follows:

$$d_\pi(M, N) \le d_\pi(M, L) + d_\pi(L, N). \tag{3.1}$$


Let us note that on the surface of the sphere the shortest path joining 
two points is the smaller arc of the great circle determined by them; this 
was proved at the end of the preceding chapter. The proof was based on 
the fact that the triangle inequality was obtained by an independent 
argument concerned only with the space determined by the surface of a 
sphere, and on our proof that equality holds in the triangle inequality if 
and only if L lies “between” M and N on the smaller arc of a great
circle.

It is useful to introduce the concept of line segment in an arbitrary
metric space. We shall define the line segment joining the points M and
N in a metric space E to be the set $E_{M,N}$ of points L which satisfy the
equality

$$d(M, N) = d(M, L) + d(L, N). \tag{3.2}$$

It is easy to see that for ordinary distance in the plane or in three-space,
the set $E_{M,N}$ coincides with the line segment MN in the ordinary sense of
the term. On the sphere S with the distance function $d_S$ introduced in
chapter 2, the segment $E_{M,N}$ is the smaller arc of the great circle joining
the points M and N if M and N do not lie on the same diameter, and the
whole sphere if the points M and N are antipodal.

2. For the sake of simplicity, we suppose that for each pair of points M and N
on a given surface π, there exists some shortest path between M and N. Using
certain assumptions concerning the properties of the surface π, it is possible to
prove this supposition.

We leave it for the reader to verify that with the distance $d_\pi(M, N)$
introduced above, the line segment $E_{M,N}$ (if it is indeed a unique path)
is the shortest path (the so-called geodesic line) joining the points M
and N.

It is also possible to generalize to an arbitrary metric space E the
concept of the sphere $S_{M,r}$ with center M and radius r as the set of points
N for which d(M, N) = r.

In the plane, this notion corresponds to that of a circle; in three-space,
to that of the ordinary sphere; for the metric (distance function) $d_S$, to
circles on the sphere S.

As still another (trivial) example of a metric space, we take an arbi- 
trary set E and define the distance between two points M and N to be 
zero if they coincide, and one otherwise. It is easy to see that all of the 
necessary conditions are satisfied by this definition. 
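
The verification can be sketched numerically. The following is only a
brute-force check on a finite sample of points, not a proof; the helper
name `is_metric`, the sample, and the tolerance are our own assumptions,
and we take axiom 3 to be the condition that the distance is zero exactly
when the two points coincide.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Check axioms 1-4 for the distance d on a finite sample of points."""
    for m, n in product(points, repeat=2):
        if abs(d(m, n) - d(n, m)) > tol:       # axiom 1: symmetry
            return False
        if d(m, n) < 0:                         # axiom 2: nonnegativity
            return False
        if (d(m, n) <= tol) != (m == n):        # axiom 3: zero only if M = N
            return False
    for m, n, l in product(points, repeat=3):
        if d(m, n) > d(m, l) + d(l, n) + tol:   # axiom 4: triangle inequality
            return False
    return True

def discrete(m, n):
    """The trivial metric: zero for coinciding points, one otherwise."""
    return 0 if m == n else 1

# The set E is arbitrary, so the sample mixes objects of different kinds.
sample = ["a", "b", "c", (1, 2), 3.5]
print(is_metric(sample, discrete))  # True
```

By contrast, the squared difference $(x - y)^2$ on the numbers 0, 1, 3 fails
the same check, since 9 > 1 + 4 violates the triangle inequality.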

Various other examples of metric spaces will be examined in chapter 4. 

In a metric space E it is always possible to define the concept of
convergence to a limit for a sequence contained in E. Roughly speaking,
a sequence of points in the metric space E, $(L_1, L_2, \ldots, L_k, \ldots)$,
denoted by $(L_k)_{k=1}^{\infty}$, is said to converge to the point $L \in E$ if,
beginning with some $L_k$, the distance between members of the sequence
and the point L (the limit) becomes smaller than any previously chosen
positive number.

Formally, the sequence $(L_k)_{k=1}^{\infty}$ is said to converge to L if for every
positive real number ε it is possible to choose a positive integer n(ε) such
that the condition k > n(ε) implies

$$d(L, L_k) < \varepsilon.$$

In keeping with the ordinary notation, we write

$$L = \lim_{k \to \infty} L_k.$$


It is easy to verify that for the metric space consisting of all real
numbers R with a metric d defined by

$$d(x, y) = |x - y|,$$

our general definition of limit coincides with the usual one.

For the metric space $R^3$, Euclidean three-space with the usual metric,
the concept of limit just defined allows us to state clearly what we mean
by the limit of a sequence of points in three-space.

Let us note that in this case the set of points M for which d(L, M) < ε
forms the interior of a sphere with center L and radius ε. A sequence of
points $(L_k)_{k=1}^{\infty}$ thus converges to the point L if and only if, for each
ε > 0, there exists some integer n(ε) such that all the points $L_k$ of the
sequence with k > n(ε) lie in the interior of the sphere with center L and
radius ε.


THEOREM. If the sequence of elements $L_1, L_2, \ldots, L_k, \ldots$ of the metric
space E converges to a limit L, then for each ε > 0 there exists an integer
m(ε) such that the conditions k > m(ε) and k′ > m(ε) imply that
$d(L_k, L_{k'}) < \varepsilon$.


Proof. By the definition of limit, it is possible to choose an integer
n(ε/2) such that k > n(ε/2) and k′ > n(ε/2) will imply the inequalities

$$d(L_k, L) < \frac{\varepsilon}{2}, \qquad d(L_{k'}, L) < \frac{\varepsilon}{2}.$$

But, by the triangle inequality and the axiom of symmetry,

$$d(L_k, L_{k'}) \le d(L_k, L) + d(L, L_{k'}) = d(L_k, L) + d(L_{k'}, L) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

In other words, if we let m(ε) = n(ε/2), then for k > m(ε) and k′ > m(ε),
the following inequality holds:

$$d(L_k, L_{k'}) < \varepsilon.$$


This proves the theorem. To paraphrase slightly, we have proved that 
if elements of a sequence become arbitrarily “close” to a given limit, 
they also become arbitrarily “close” to each other. 
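
As an illustration of the definition of limit and of the theorem, here is
a minimal sketch on the real line with the metric d(x, y) = |x − y|. The
sequence $L_k = 1/k$ and the function names are our own choices, and only a
finite stretch of the sequence is examined, so this is a check rather
than a proof.

```python
from fractions import Fraction
from math import floor

def d(x, y):
    """Usual metric on the real line."""
    return abs(x - y)

def L(k):
    """Example sequence L_k = 1/k (our choice); its limit is 0."""
    return Fraction(1, k)

def n_of(eps):
    """An index n(eps): k > n(eps) forces k > 1/eps, hence d(0, L_k) < eps."""
    return floor(1 / eps)

def m_of(eps):
    """The choice m(eps) = n(eps/2) made in the proof of the theorem."""
    return n_of(eps / 2)

eps = Fraction(1, 50)
tail_n = range(n_of(eps) + 1, n_of(eps) + 200)   # some indices k > n(eps)
tail_m = range(m_of(eps) + 1, m_of(eps) + 60)    # some indices k, k' > m(eps)

converges = all(d(0, L(k)) < eps for k in tail_n)
cauchy = all(d(L(k), L(j)) < eps for k in tail_m for j in tail_m)
print(converges, cauchy)  # True True
```

Exact fractions are used instead of floating-point numbers so that the
strict inequalities are tested without rounding error.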

If in the space E the converse of the above theorem holds, then E is 
called complete. 


It is convenient to give the definition of a complete metric space in
the following form: A sequence of points $(L_k)_{k=1}^{\infty}$ contained in the
metric space E is said to be a Cauchy sequence if for each ε > 0, there
exists an integer m(ε) such that k > m(ε) and k′ > m(ε) imply

$$d(L_k, L_{k'}) < \varepsilon.$$

The metric space E is called complete if each Cauchy sequence in E
converges to a point of E.

The real line, the plane, and three-space with their usual metrics are 
complete metric spaces. 

The question of whether or not a given metric space is complete is 
fundamental to the application of these concepts in mathematical 
analysis, but we shall not concern ourselves with this question at the
present time.³

Two metric spaces are said to be isometric if it is possible to set up a 
one-to-one correspondence between them such that the distance between 
a pair of points in one of the spaces is the same as that between the 
corresponding points in the other space. From the point of view of the 
theory of metric spaces, two isometric spaces may be considered 
identical. 

As an example, let the space E be the plane along with the ordinary
metric, and the space E′ the set of complex numbers z with a metric d′
defined by the formula

$$d'(z, z_1) = |z - z_1|.$$

The usual method of picturing the complex numbers as points on the
plane establishes the existence of a one-to-one correspondence between
the two spaces. It is easy to check that this correspondence is an
isometry, since, if we set $z = x + yi$ and $z_1 = x_1 + y_1 i$, the quantity

$$|z - z_1| = \sqrt{(x - x_1)^2 + (y - y_1)^2}$$

is equal to the distance between the corresponding points of the plane.
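
The isometry can also be checked on sample points. The sketch below
assumes points of the plane are given as coordinate pairs; the function
names and the random sample are ours.

```python
import math
import random

def d_plane(p, q):
    """Ordinary distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d_complex(z, z1):
    """Distance between two complex numbers: the modulus of their difference."""
    return abs(z - z1)

def to_complex(p):
    """The usual correspondence (x, y) -> x + yi."""
    return complex(p[0], p[1])

def preserves_distance(pairs):
    """True if the correspondence keeps every sampled distance unchanged."""
    return all(
        math.isclose(d_plane(p, q),
                     d_complex(to_complex(p), to_complex(q)),
                     rel_tol=1e-12, abs_tol=1e-12)
        for p, q in pairs
    )

random.seed(0)
pairs = [((random.uniform(-10, 10), random.uniform(-10, 10)),
          (random.uniform(-10, 10), random.uniform(-10, 10)))
         for _ in range(1000)]
print(preserves_distance(pairs))  # True
```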

The definitions of metric space and of distance given here are not the 
most general encountered in mathematics. There are various generaliza- 
tions of this concept. For instance, it would seem possible to assign 
infinite distance to some pairs of points, while still preserving all of the 
properties of distance. This generalization, as we shall see in chapter 9, 
is not particularly interesting. In many mathematical problems it is 
necessary to deal with a metric in which the property of symmetry is 
lacking. We shall study the properties of such a metric in chapter 9. In 
the theory of relativity, it is necessary to consider a distance function 
which can take on even imaginary values. The properties of such a 
distance are quite unique, but we shall not touch upon them in this 
book. 


3. The notion of completeness is of most importance to mathematical analysis 
when applied to metric spaces whose points are functions. See, for example, the 
definition of the space C at the end of chapter 7. 


4 Some Examples of Metric Spaces


In this chapter we shall look at a number of examples of metric 
spaces with relatively unusual metrics. 

Many interesting metric spaces on the plane arise out of considera- 
tion of differently defined distance functions. We shall represent the 
points of the plane in this discussion with the aid of a coordinate system 
chosen once and for all so that each point of the plane is given by an 
ordered pair of coordinates (x, y). It will be convenient to denote a 
point of the plane as M = (x, y). 

The metric space $l$ results when we define the distance between the
points M = (x, y) and N = $(x_1, y_1)$ by the formula

$$d_1(M, N) = |x - x_1| + |y - y_1|. \tag{4.1}$$

Figure 2.2 (p. 5) shows that $d_1(M, N)$ is the sum of the lengths of the
legs of the triangle MLN, in which MN is the hypotenuse and the legs
ML and LN are parallel to the axes of the coordinate system. Since the
length of the hypotenuse cannot exceed the sum of the lengths of the
legs, we have always

$$d(M, N) \le d_1(M, N), \tag{4.2}$$

where d(M, N) is the usual planar distance. The inequality (4.2)
becomes an equality only when the line segment MN is either horizontal
or vertical, that is, when it is parallel to one of the coordinate axes.
If in inequality (4.2) we substitute the algebraic expressions for the
corresponding distance functions (4.1) and (2.2), we get the inequality

$$\sqrt{(x - x_1)^2 + (y - y_1)^2} \le |x - x_1| + |y - y_1|.$$

Setting $x_1 = y_1 = 0$, we get the simple but important inequality

$$\sqrt{x^2 + y^2} \le |x| + |y|. \tag{4.3}$$

Axioms 1, 2, and 3 are obviously satisfied by the metric $d_1(M, N)$. In
order to verify that axiom 4 is also satisfied, we examine three points
M = (x, y), N = $(x_1, y_1)$, and L = $(x_2, y_2)$ and write the elementary
identity

$$|x - x_1| + |y - y_1| = |x - x_2 + x_2 - x_1| + |y - y_2 + y_2 - y_1|. \tag{4.4}$$

Using the fact that for arbitrary real numbers a and b, $|a + b| \le |a| + |b|$,
from (4.4) we get the inequality

$$|x - x_1| + |y - y_1| \le |x - x_2| + |x_2 - x_1| + |y - y_2| + |y_2 - y_1|,$$

which is the desired relation

$$d_1(M, N) \le d_1(M, L) + d_1(L, N). \tag{4.5}$$

And so the triangle inequality holds for the space $l$.
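
A small numerical sketch of formula (4.1) and of the inequalities (4.2)
and (4.5) follows. The function names and the three sample points are our
own choices.

```python
import math

def d1(m, n):
    """Distance (4.1): sum of the absolute coordinate differences."""
    return abs(m[0] - n[0]) + abs(m[1] - n[1])

def d_euclid(m, n):
    """Usual planar distance."""
    return math.hypot(m[0] - n[0], m[1] - n[1])

M, N, L = (0, 0), (3, 4), (3, 0)

print(d_euclid(M, N), d1(M, N))              # 5.0 7
print(d_euclid(M, N) <= d1(M, N))            # inequality (4.2): True
print(d1(M, N) <= d1(M, L) + d1(L, N))       # inequality (4.5): True
print(d_euclid(M, L) == d1(M, L))            # equality on a horizontal segment: True
```

Here the legs of the triangle MLN have lengths 3 and 4, so the
hypotenuse 5 is indeed shorter than their sum 7.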

The distance $d_1(M, N)$ can be interpreted as the length of a minimal
path traversed by a particle moving from M to N that is constrained to
move only along line segments parallel to the coordinate axes. Figure
4.1 makes it evident that there are many (in fact, infinitely many) such
minimal paths.

It is not hard to show that this statement is equivalent to saying that
in the space $l$ there exist infinitely many distinct line segments¹ joining
the points M and N (except in the case where the points M and N are
situated on the same vertical or horizontal line); for a line segment in
the space $l$ joining the points M and N is any broken line joining M and
N which consists only of vertical and horizontal lines which do not
intersect any vertical or horizontal line more than once. (We suggest
that the reader prove this as an exercise.)











Fig. 4.1    Fig. 4.2


1. In the sense of the definition introduced in chapter 3.

One gets a still more natural picture by considering the metric space
C consisting of all the lattice points of some rectangular lattice in the
plane (fig. 4.2) with the metric defined by formula (4.1). Points of this
space can be viewed as the intersections of the streets of a perfectly
planned city. The distance $d_1(M, N)$ is in this case the length of the
shortest path which one can take along the streets of the city from the
intersection M to the intersection N, without taking any shortcuts
through houses.
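
The street picture can be tested directly. In the sketch below we assume
a square city of `size` by `size` intersections with blocks of unit
length (both assumptions are ours) and find the shortest walk along the
streets by breadth-first search; it agrees with formula (4.1).

```python
from collections import deque

def street_distance(start, goal, size):
    """Fewest unit blocks walked between two intersections of the grid."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == goal:
            return dist[(x, y)]
        # From an intersection one may walk one block along either street.
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < size and 0 <= ny < size and (nx, ny) not in dist:
                dist[(nx, ny)] = dist[(x, y)] + 1
                queue.append((nx, ny))
    return None  # goal lies outside the city

def d1(m, n):
    """Distance (4.1)."""
    return abs(m[0] - n[0]) + abs(m[1] - n[1])

print(street_distance((0, 0), (3, 2), 5), d1((0, 0), (3, 2)))  # 5 5
```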

In the following example, the space C will consist of points in the
plane with the metric $d_\infty$² defined by the formula

$$d_\infty(M, N) = \max(|x - x_1|, |y - y_1|), \tag{4.6}$$

where M has coordinates (x, y) and N has coordinates $(x_1, y_1)$.
Geometrically (fig. 2.2), the distance $d_\infty(M, N)$ can be interpreted as the
length of the larger leg of the triangle MLN. As this length is always less
than that of the hypotenuse (or equal to it in the case of a degenerate
triangle), we have

$$d_\infty(M, N) \le d(M, N), \tag{4.7}$$

where d(M, N) is the usual planar distance. Again, setting $x_1 = y_1 = 0$,
we get the algebraic inequality

$$\max(|x|, |y|) \le \sqrt{x^2 + y^2}. \tag{4.8}$$

For the metric $d_\infty$, axioms 1, 2, and 3 are again quite evident. To
prove the triangle inequality, suppose we have three arbitrary points
M = (x, y), N = $(x_1, y_1)$, and L = $(x_2, y_2)$. We may assume that
$|x - x_1| \ge |y - y_1|$.³ This means that

$$d_\infty(M, N) = \max(|x - x_1|, |y - y_1|) = |x - x_1| = |x - x_2 + x_2 - x_1|.$$

Consequently,

$$d_\infty(M, N) \le |x - x_2| + |x_2 - x_1|. \tag{4.9}$$

Moreover, it is evident that

$$\begin{aligned}
|x - x_2| &\le \max(|x - x_2|, |y - y_2|) = d_\infty(M, L), \\
|x_2 - x_1| &\le \max(|x_2 - x_1|, |y_2 - y_1|) = d_\infty(L, N).
\end{aligned} \tag{4.10}$$

Combining (4.9) and (4.10), we get

$$d_\infty(M, N) \le d_\infty(M, L) + d_\infty(L, N), \tag{4.11}$$

the desired result.

2. The meaning of the symbol ∞ will be made clear on page 22.
3. We can make this assumption without loss of generality, for in the opposite
case ($|x - x_1| < |y - y_1|$), we interchange the roles of the x and y coordinates
and carry out the same proof.
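
The maximum metric of formula (4.6) and the inequalities (4.7) and (4.11)
can be tried out in the same way. The function names and sample points
below are our own.

```python
import math
from itertools import product

def d_inf(m, n):
    """Distance (4.6): the larger of the two absolute coordinate differences."""
    return max(abs(m[0] - n[0]), abs(m[1] - n[1]))

def d_euclid(m, n):
    """Usual planar distance."""
    return math.hypot(m[0] - n[0], m[1] - n[1])

M, N, L = (0, 0), (3, 4), (1, 5)
print(d_inf(M, N), d_euclid(M, N))                   # 4 5.0
print(d_inf(M, N) <= d_inf(M, L) + d_inf(L, N))      # inequality (4.11): True

# Exhaustive check of (4.7) and (4.11) on a small integer grid.
grid = list(product(range(-2, 3), repeat=2))
ok_47 = all(d_inf(m, n) <= d_euclid(m, n) for m in grid for n in grid)
ok_411 = all(d_inf(m, n) <= d_inf(m, l) + d_inf(l, n)
             for m in grid for n in grid for l in grid)
print(ok_47, ok_411)  # True True
```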

We have already noted that in an arbitrary metric space it is possible 
to introduce the concept of a sphere of radius r with center M, defined 
as the set of points N for which 


$$d(M, N) = r. \tag{4.12}$$

If the distance function d is the ordinary distance on the plane, this
sphere is just the circle with center M and radius r.

For three-space with the ordinary metric, the sphere defined by (4.12) is
just the ordinary sphere with center M and radius r.

In the space $l$ the sphere is a square with center M and diagonals of
length 2r parallel to the coordinate axes.

Fig. 4.3

In the space C the sphere is also a square with center M, but with sides
of length 2r parallel to the coordinate axes. In figure 4.3 we have
pictured the sphere of radius r in the spaces $l$ and C and in the usual
sense. The proof that spheres in $l$ and C have the form indicated above
is left as an exercise for the reader.
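
Before attempting the proof, the reader may find it helpful to test a few
points against definition (4.12). The sketch below does this for the unit
sphere about the origin in the three metrics; the helper `on_sphere`, its
tolerance, and the sample points are our own assumptions.

```python
import math

def d1(m, n):
    return abs(m[0] - n[0]) + abs(m[1] - n[1])

def d_inf(m, n):
    return max(abs(m[0] - n[0]), abs(m[1] - n[1]))

def d_euclid(m, n):
    return math.hypot(m[0] - n[0], m[1] - n[1])

def on_sphere(d, center, r, point, tol=1e-9):
    """True if `point` satisfies d(center, point) = r, up to rounding."""
    return abs(d(center, point) - r) <= tol

M = (0.0, 0.0)
for point in [(1.0, 0.0), (0.5, 0.5), (1.0, 1.0), (0.6, 0.8)]:
    print(point,
          on_sphere(d1, M, 1, point),        # square with diagonals on the axes
          on_sphere(d_inf, M, 1, point),     # square with sides parallel to the axes
          on_sphere(d_euclid, M, 1, point))  # ordinary circle
```

The point (1, 0) lies on all three spheres, while (0.5, 0.5) lies only on
the first, (1, 1) only on the second, and (0.6, 0.8) only on the circle.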

An interesting class of metric spaces is obtained when we define a
metric $d_p$ on the plane by the formula

$$d_p(M, N) = \sqrt[p]{|x - x_1|^p + |y - y_1|^p}. \tag{4.13}$$

The spaces so obtained are called $l_p$ spaces.
Axioms 1, 2, and 3 for a metric $d_p$ are obvious. The triangle inequality
follows from Minkowski’s inequality:⁴

$$\sqrt[p]{|a + a_1|^p + |b + b_1|^p} \le \sqrt[p]{|a|^p + |b|^p} + \sqrt[p]{|a_1|^p + |b_1|^p}, \tag{4.14}$$

which is true for $p \ge 1$, if for the points M = (x, y), N = $(x_1, y_1)$, and
L = $(x_2, y_2)$, we take

$$a = x - x_2; \quad a_1 = x_2 - x_1; \quad b = y - y_2; \quad b_1 = y_2 - y_1.$$

For p < 1, the triangle inequality is not true; the inequality in (4.14) is
reversed.

4. A proof of Minkowski’s inequality can be found in Geoffrey H. Hardy,
John E. Littlewood, and George Polya, Inequalities (Cambridge: The University
Press, 1934).

It is easy to see that for p = 1 the distance $d_p(M, N) = d_1(M, N)$,
whereas for p = 2 the distance $d_p(M, N)$ is just the usual distance
d(M, N). Thus, the space $l$ coincides with the space $l_1$, and the plane
with the usual metric is the space $l_2$.
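
The special cases p = 1 and p = 2, and the failure of the triangle
inequality for p < 1, can be seen on concrete points. The sketch below
uses formula (4.13); the function name and the three sample points are
ours, and p = 1/2 is merely one convenient value below 1.

```python
def d_p(m, n, p):
    """Distance (4.13) between two points of the plane, for p > 0."""
    return (abs(m[0] - n[0]) ** p + abs(m[1] - n[1]) ** p) ** (1.0 / p)

# p = 1 gives the sum of the legs, p = 2 the usual distance.
print(d_p((0, 0), (3, 4), 1), d_p((0, 0), (3, 4), 2))  # 7.0 5.0

# For p = 1/2 the triangle inequality fails at the points below.
M, L, N = (0, 0), (1, 0), (1, 1)
direct = d_p(M, N, 0.5)                   # (1 + 1)^2 = 4
detour = d_p(M, L, 0.5) + d_p(L, N, 0.5)  # 1 + 1 = 2
print(direct, detour, direct <= detour)   # 4.0 2.0 False
```

With p = 1/2 the "direct" distance from M to N is twice the length of
the detour through L, so no metric is obtained.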

We shall now show that the distance $d_p(M, N)$ converges to the
distance $d_\infty(M, N)$ as $p \to \infty$.

Let us first examine the case $|x - x_1| > |y - y_1|$. Then $d_\infty(M, N) =
|x - x_1|$. On the other hand, transforming (4.13), we have

$$d_p(M, N) = |x - x_1| \sqrt[p]{1 + \left|\frac{y - y_1}{x - x_1}\right|^p}. \tag{4.15}$$

Noticing that for p > 1,

$$1 \le \sqrt[p]{1 + \left|\frac{y - y_1}{x - x_1}\right|^p} \le 1 + \left|\frac{y - y_1}{x - x_1}\right|^p$$

and that the quantity $|(y - y_1)/(x - x_1)|^p \to 0$ as $p \to \infty$ (since
$|x - x_1| > |y - y_1|$ and, thus, $|y - y_1|/|x - x_1| = |(y - y_1)/(x - x_1)| < 1$),
we get

$$\lim_{p \to \infty} \sqrt[p]{1 + \left|\frac{y - y_1}{x - x_1}\right|^p} = 1.$$

Using (4.15), we see that

$$\lim_{p \to \infty} d_p(M, N) = |x - x_1| = d_\infty(M, N). \tag{4.16}$$


Analogously, for $|x - x_1| < |y - y_1|$, we obtain

$$\lim_{p \to \infty} d_p(M, N) = |y - y_1| = d_\infty(M, N). \tag{4.17}$$

Finally, let us examine the case $|x - x_1| = |y - y_1|$. Then

$$d_p(M, N) = \sqrt[p]{2|x - x_1|^p} = |x - x_1| \sqrt[p]{2}.$$

Since $\lim_{p \to \infty} \sqrt[p]{2} = 1$, we have in this case

$$\lim_{p \to \infty} d_p(M, N) = |x - x_1| = d_\infty(M, N). \tag{4.18}$$

And so in all three cases, by (4.16), (4.17), and (4.18), we get

$$\lim_{p \to \infty} d_p(M, N) = d_\infty(M, N), \tag{4.19}$$

the desired result.

Consequently, it is reasonable to denote the space C by the symbol
$l_\infty$, since the distance $d_\infty(M, N)$ in this space is the limit of the distances
$d_p(M, N)$ as p approaches infinity.
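
The convergence can be watched numerically. In the sketch below the
larger coordinate difference is factored out before the p-th root is
taken, as in transformation (4.15); this is our own precaution against
overflow for large p, and the function names and sample points are also
ours.

```python
def d_p(m, n, p):
    """Distance (4.13), computed in the factored form (4.15)."""
    a, b = abs(m[0] - n[0]), abs(m[1] - n[1])
    big, small = max(a, b), min(a, b)
    if big == 0:
        return 0.0
    return big * (1 + (small / big) ** p) ** (1.0 / p)

def d_inf(m, n):
    """Distance (4.6)."""
    return max(abs(m[0] - n[0]), abs(m[1] - n[1]))

M, N = (0, 0), (3, 4)
for p in (1, 2, 10, 100, 1000):
    print(p, d_p(M, N, p))   # the values decrease from 7.0 toward 4
print(d_inf(M, N))           # 4
```

In the case of equal coordinate differences, for instance N = (2, 2),
the same function returns 2 times the p-th root of 2, which also tends
to $d_\infty(M, N) = 2$, only more slowly.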

Figure 4.4 depicts $l_p$ spheres (all having the same center M) for various
values of p. The $l_p$ metric spaces are also called Minkowski spaces. In
chapter 7 we shall examine multidimensional Minkowski spaces.

Fig. 4.4

We leave it to the reader to formulate a simple definition of line
segment for $l_p$ spaces.

We can obtain an interesting class of metric spaces in the plane by
defining distance as the minimum time required to travel from M to N
with some given restrictions on the paths which may be taken.


Making no restrictions, we can obtain the usual distance if the 
shortest path from M to N is taken by a point moving with a constant 
velocity of one. 

We can obtain the metric space $l$ if we require this point to move again
with constant velocity but only along line segments parallel to the
coordinate axes.

But we get a new example if (see fig. 4.5) we consider the map of the 
Moscow metropolitan area and suppose that a traveler may go from 
point M to point N in the following manner.

Fig. 4.5

If the same subway station is the nearest one to both points, the
shortest route is on foot. If this is not the case, the traveler walks (by
the shortest route) to the station closest to the point of departure M,
rides by the shortest route to the station closest to the point N, and from
there walks to N. If two or more subway stations are equally close to M
or N, the route for which the riding time will be least is chosen. Figure 4.5
shows two pairs of points, (M, N) and $(M_1, N_1)$; to go from M to N one
must walk, whereas to go from $M_1$ to $N_1$ one must take a subway. Let
us suppose that someone living between the Rizhskii and Botanic
stations wants to go somewhere in the neighborhood of the Zemlyanoi
Val; then it would be necessary to get on at the Botanic station and go
to either the Lermontov station or the Kursk station. It is easy to see
that the metric $d_t(M, N)$ defined in this way is, in general, different from
the usual geometric distance. In fact, if the point Q is situated near a
heliport (either Dynamo or Aeroport), the point P near Volokolamskii
Highway, and the point R near Valovaya Street (near the Paveletskii
subway station) as in figure 4.5, then in the sense of ordinary distance
the point P is somewhat closer to the point Q than is R:

$$d(P, Q) < d(Q, R).$$

It is evident from figure 4.5, however, that

$$d_t(P, Q) > d_t(Q, R).$$

Actually, if one cannot take a taxi, it is possible to travel from the
heliport to Valovaya Street in less time than it takes to go from the
heliport to Volokolamskii Highway.

For the metric $d_t$, axiom 1 (the axiom of symmetry) is nontrivial.

The equality

$$d_t(M, N) = d_t(N, M)$$

indicates that the time spent in going from M to N as quickly as possible
is the same as that spent going from N to M. This is more or less true if
one uses only the subway or travels only on foot. But if taxis are allowed,
this is no longer true; it is one thing to try to get a taxi at a taxi stand
and an entirely different thing to try to get one in some remote
neighborhood or at the Kursk station when the trains are coming in.

Axioms 2 and 3 for the metric $d_t$ are evident. The reader will have no
trouble proving the triangle inequality (axiom 4) for himself if he
recalls the proof of this axiom for the metric $d_\pi$ in chapter 3.
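
The travel-time distance can be imitated on a toy map. Everything
concrete in the sketch below is hypothetical: two invented stations A
and B on a straight line, a ride of 12 minutes between them, and a
walking pace of 12 minutes per kilometer. It is not a model of the
Moscow subway, and the rule for equally close stations is ignored.

```python
import math

# Hypothetical data: station coordinates in km, ride times in minutes.
STATIONS = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
RIDE_MIN = {frozenset(("A", "B")): 12.0}
WALK_MIN_PER_KM = 12.0

def walk(p, q):
    """Walking time in minutes along the straight line from p to q."""
    return WALK_MIN_PER_KM * math.hypot(p[0] - q[0], p[1] - q[1])

def nearest(p):
    """Name of the station closest to the point p."""
    return min(STATIONS, key=lambda s: walk(p, STATIONS[s]))

def travel_time(m, n):
    """Walk if both points share a nearest station; otherwise walk, ride, walk."""
    a, b = nearest(m), nearest(n)
    if a == b:
        return walk(m, n)
    return walk(m, STATIONS[a]) + RIDE_MIN[frozenset((a, b))] + walk(STATIONS[b], n)

P, Q, R = (4.0, 0.0), (1.0, 0.0), (9.5, 0.0)
print(math.dist(P, Q), math.dist(Q, R))      # 3.0 8.5  (kilometers)
print(travel_time(P, Q), travel_time(Q, R))  # 36.0 30.0  (minutes)
```

On this toy map P is nearer to Q than R is in the ordinary sense, yet the
trip from Q to R takes less time than the trip from Q to P, which is the
same reversal as the one described for figure 4.5.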

In further investigation into the properties of metric spaces, it will be
useful to introduce the concept of a Dirichlet region. Let E be a metric
space and $L_1, L_2, \ldots, L_n$ points in E. We define the Dirichlet region of
the point $L_i$ to be the set of all points N for which

$$d(L_i, N) \le d(L_j, N) \tag{4.20}$$

for all $j \ne i$, and denote this set by $D_i$. In other words, the Dirichlet
region $D_i$ is the set of points which are at least as close to the point $L_i$
as to any of the other given points $L_j$. It is clear that the Dirichlet region
is determined by the choice of the points $L_1, L_2, \ldots, L_n$ and of the given
point $L_i$. We shall now look at examples of Dirichlet regions in various
metric spaces.
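
Condition (4.20) translates directly into a membership test. The sketch
below returns, for a point N, the indices of all Dirichlet regions that
contain it; a point on a common boundary belongs to more than one region.
The function names are ours, and the indices are counted from 0 rather
than from 1, as is usual in Python.

```python
def dirichlet_regions(point, centers, d):
    """Indices i for which `point` lies in the Dirichlet region of centers[i]."""
    dists = [d(c, point) for c in centers]
    nearest = min(dists)
    # (4.20) holds for index i exactly when its distance is the smallest one.
    return [i for i, value in enumerate(dists) if value == nearest]

def d1(m, n):
    return abs(m[0] - n[0]) + abs(m[1] - n[1])

def d_inf(m, n):
    return max(abs(m[0] - n[0]), abs(m[1] - n[1]))

centers = [(0, 0), (4, 2)]
print(dirichlet_regions((1, 1), centers, d1))     # [0]
print(dirichlet_regions((3, 0), centers, d1))     # [0, 1]
print(dirichlet_regions((3, 0), centers, d_inf))  # [1]
```

The point (3, 0) lies on the common boundary for the first metric but
strictly inside the second region for the maximum metric, which shows
that the partition depends on the metric as well as on the given points.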


























Fig. 4.6    Fig. 4.7

Let us first consider the plane with the usual metric d, and two points
$L_1$ and $L_2$. We join these points by the line segment $L_1L_2$ (fig. 4.6) and
draw a perpendicular through its midpoint. This perpendicular divides
the plane into two closed half-planes, which are the Dirichlet regions for
the points $L_1$ and $L_2$.

Let us now consider three points $L_1$, $L_2$, and $L_3$ in the plane, again
with the usual metric. In figure 4.7 we have constructed the Dirichlet
regions for these three points and marked them off with heavy lines.
The method of construction is clear from the diagram.

Let us now examine two points $L_1$ and $L_2$ in the plane with the metric
$d_1(M, N)$ (that is, in the space $l$). For the sake of clarity, we shall again
visualize a city divided into squares. The Dirichlet regions consist of
those intersections from which the route through the city to $L_1$ will be
shorter than that to $L_2$ and vice versa. These regions are marked off in
figure 4.8 by a heavy line. Figure 4.9 shows the corresponding partition
for the space C. We suggest that the reader try to derive the general rule
for constructing the Dirichlet regions for n points in the spaces $l$ and C
by examining Dirichlet regions for two points and for three points.

Fig. 4.8    Fig. 4.9


Turning again to figure 4.5, we see that if we partition the space into 
Dirichlet regions for the pair of points P and R, then the point Q falls 
into the Dirichlet region of the point R. We suggest that the reader draw 
this partition into Dirichlet regions. It is important to note that this 
partition differs greatly from those in figures 4.6, 4.8, and 4.9. 


5 The Space of Information


When we speak of communication, we usually mean some sort of 
transmission of information. In this sense, communication appears in 
the form of books, letters, telegrams, musical pieces (recorded or 
written in musical notation), computer cards, signals directing the 
flight and landing of space ships, molecules of deoxyribonucleic acid 
(DNA) which transmit genetic information from parents to offspring, 
and so on. 

Questions concerning the transmission and codification of information
are examined in the theory of information.¹ In the study of this
theory, methods for determining the “quantity of information” contained
in a given message are developed; this “quantity” can itself be
encoded as information. We frequently encounter this situation in our
daily lives; in composing a telegram we try to use the minimum number
of words possible without destroying the meaning (that is, while
preserving the quantity of information).

The reverse situation arises when, in an examination or in a seminar, 
a poorly prepared student amplifies his message, trying to express the 
small amount of information which he has on his topic in a sufficiently 
impressive quantity of words. 

A surplus of communication relative to the quantity of information to 
be transmitted is, however, not always harmful. Such redundancy can 
be useful when interference arises in the transmission of information. 

For example, when we have a bad connection on the telephone, we
are forced to repeat individual words. In conveying strange or difficult
names, we use the following alphabetical device: In communicating the
name “Pavsikakii” over the telephone, for example, we might say,
“Peter, Anne, Victor, Susan, Irene, Karen, Albert, Kay, Ivan, Ida.”

1. A good reference for an account of information theory is A. M. Yaglom and
I. M. Yaglom, Veroyatnost’ i informatsiya [Probability and Information] (Moscow:
State Publishing House of Physics and Mathematics Literature, 1960). A translation
of this work will be included in the Survey of Recent East European
Mathematical Literature of the University of Chicago.

In this chapter and in the next we shall study methods of error- 
stabilizing codification of information, without concerning ourselves 
with specific questions relating to the theory of information. In other 
words, we shall study methods of writing down messages that allow us 
to correct automatically any errors that arise, provided that they are not 
too numerous. These methods are closely connected with the question 
of the possibility of defining a metric on the so-called space of in- 
formation. 

The idea of these methods is something we make use of frequently in
everyday life, for instance, in reading books with printing errors and
receiving telegrams with mistakes. If we read the word “sauce pin” in
a book, we need not look in any “dictionary of mistakes” in order to
guess its meaning. There is very little chance that the author meant the
word “telegraph” here. For if he did, we would be dealing with eight
misprints in a row, whereas if the word “sauce pan” was meant, there
would be only one misprint.²

Still, there are curious examples where a totally different word can
arise from a mistake in only one letter. For example, the Russian word
“korona” (“crown”) could be mistakenly written as “korova” (“cow”)
or as “vorona” (“crow”).

Indeed, a well-known anecdote is based on this situation. A Russian 
provincial newspaper is said to have printed this sentence in an article 
about the coronation of Nicholas II: “The Metropolitan placed the 
crow on His Highness’s head.” The next day a correction was published: 
“The Metropolitan placed the cow on His Highness’s head.” 

Clearly, even here it is quite easy to determine the true meaning of the 
message from the context. 
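
The counting of misprints used in these examples is easy to mechanize.
The sketch below is our own illustration, not a method described so far
in the text: it counts the positions in which two words of equal length
differ and picks, from a small hypothetical word list, the word that
requires the fewest misprints.

```python
def misprints(received, intended):
    """Number of positions in which two words of equal length differ."""
    if len(received) != len(intended):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(received, intended))

def most_likely(received, word_list):
    """The word of the same length that differs from `received` in fewest places."""
    candidates = [w for w in word_list if len(w) == len(received)]
    return min(candidates, key=lambda w: misprints(received, w))

print(misprints("korova", "korona"))   # 1
print(misprints("vorona", "korona"))   # 1
print(misprints("korova", "vorona"))   # 2
print(most_likely("sauce pin", ["telegraph", "sauce pan"]))  # sauce pan
```

Note that “korova” and “vorona” each differ from “korona” in one letter
but from each other in two, so a count of misprints alone cannot decide
between them; here the context must help, as in the anecdote.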

Analogously, a misprint in a musical composition can frequently be 
discovered because of its false sound and can be corrected by the laws 
of harmony. 

One must mention that errors can arise not only in transmission of 
information, but also during its storage, for example, in the memory of 
an electronic computer. The problem of discovering the correct message 
is the same for errors occurring during the storage of the message as for 
those arising during transmission. 

Every type of message is written with the aid of some set of symbols.
The set of symbols used forms an alphabet $\mathfrak{A}$. We assume that this
alphabet is given beforehand and consists of a finite number of symbols.
For example, the alphabet might consist of all Russian letters, a space,
and punctuation marks. Using this alphabet, it is possible to write any
arbitrary Russian sentence. Another example of an alphabet is the set
of all decimal digits, algebraic symbols, punctuation marks, and Latin
and Greek letters. Using such an alphabet, one can write down the most
diverse of mathematical formulas.

2. Of course, sometimes there are more probable strings of misprints, arising
from a typist’s or typesetter’s misunderstanding of the sense of certain words.

Still another example is the binary alphabet, a set of two symbols,
$\mathfrak{A}_2 = \{0, 1\}$. Using such an alphabet, we can write any number in the
binary system.

It is easily verified that any whole number x can be written in the
form

$$x = \varepsilon_n 2^n + \varepsilon_{n-1} 2^{n-1} + \cdots + \varepsilon_1 2 + \varepsilon_0, \tag{5.1}$$

where the quantities $\varepsilon_i$ take on a value of 0 or 1.

Thus, to transmit information about an integer x, it suffices to transmit
a finite sequence of symbols of the alphabet $\mathfrak{A}_2$:
$\varepsilon_n, \varepsilon_{n-1}, \ldots, \varepsilon_1, \varepsilon_0$.
In order to separate the information about two different numbers, it is
necessary either to introduce a special symbol for the end of a number
or to transmit only sequences of some standard length.³
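
Representation (5.1) and the device of a standard length can be sketched
as follows. The function names are ours, and x is assumed to be a
nonnegative integer.

```python
def binary_digits(x):
    """Digits [e_n, ..., e_1, e_0] of a nonnegative integer x, as in (5.1)."""
    if x < 0:
        raise ValueError("x must be nonnegative")
    digits = []
    while True:
        digits.append(x % 2)   # the current lowest binary digit
        x //= 2
        if x == 0:
            break
    return digits[::-1]        # leading digit first

def from_digits(digits):
    """Evaluate the sum in (5.1) for a given sequence of digits."""
    return sum(e * 2 ** i for i, e in enumerate(reversed(digits)))

def fixed_width(x, width):
    """Pad the digits of x with leading zeros to a standard length."""
    digits = binary_digits(x)
    if len(digits) > width:
        raise ValueError("x does not fit into the standard length")
    return [0] * (width - len(digits)) + digits

print(binary_digits(13))                 # [1, 1, 0, 1]
print(from_digits([1, 1, 0, 1]))         # 13
print(fixed_width(5, 8))                 # [0, 0, 0, 0, 0, 1, 0, 1]
```

With a standard length of eight symbols, a stream of digits can be cut
into numbers without any end-of-number symbol.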

The latter method is the one actually used on computers, where the 
binary sequences to be stored in memory usually have a standard length 
corresponding to the number of “memory cells” available in the
machine. In computers now being manufactured, however, this principle 
is being departed from more and more, with memories of variable 
length being used. 

Formula (5.1) is analogous to the well-known formula

$$x = a_n 10^n + a_{n-1} 10^{n-1} + \cdots + a_1 10 + a_0, \tag{5.2}$$

where $a_n, a_{n-1}, \ldots, a_1, a_0$ are the digits in the decimal representation of
the number x. It is easy to generalize equality (5.1) to numbers which
are not integers exactly as is done for decimal fractions.

Let us determine the connection between the number n and the value
of x in (5.1). Clearly, if the leading coefficient is equal to zero, the leading
term can be discarded; this process can be carried out repeatedly until
$\varepsilon_n = 1$. Switching all terms except the first from the right side to the
left, we get

$$x - \varepsilon_{n-1} 2^{n-1} - \varepsilon_{n-2} 2^{n-2} - \cdots - \varepsilon_1 2 - \varepsilon_0 = \varepsilon_n 2^n = 2^n.$$

This makes it clear that

$$2^n \le x$$

or

$$n \le \log_2 x. \tag{5.3}$$

3. There are more complicated methods for separating the meaningful units
(words) in an arbitrary alphabet. See, for example, the article by A. A. Sardinas
and George Patterson in Kiberneticheskii sbornik [Journal of Cybernetics], no. 3,
Moscow, 1961.


On the other hand, the following inequality holds:

$$\varepsilon_n 2^n + \varepsilon_{n-1} 2^{n-1} + \varepsilon_{n-2} 2^{n-2} + \cdots + \varepsilon_1 2 + \varepsilon_0
\le 2^n + 2^{n-1} + 2^{n-2} + \cdots + 2 + 1 = 2^{n+1} - 1.$$

From this relation and from (5.1), it follows that

$$x \le 2^{n+1} - 1$$

or

$$x < 2^{n+1},$$

which can also be written

$$n + 1 > \log_2 x. \tag{5.4}$$

Combining (5.3) and (5.4), we obtain the inequality

$$n \le \log_2 x < n + 1. \tag{5.5}$$

The inequality (5.5) can be written as

$$n = [\log_2 x],$$

that is, n is equal to the greatest integer in $\log_2 x$.⁴ The above statement
leads us to conclude that the number of binary symbols required to
code all integers in the range $0 \le x \le a$ is

$$1 + [\log_2 a] = 1 + n. \tag{5.6}$$

4. By the greatest integer in the number a we mean the largest integer which is
less than or equal to a. The greatest integer in a is written [a]; for example,
[π] = 3.

The one is included here because there are n + 1 terms in the sequence
$\varepsilon_n, \varepsilon_{n-1}, \ldots, \varepsilon_1, \varepsilon_0$.
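
Relations (5.5) and (5.6) can be checked for small numbers. The sketch
below relies on Python's built-in `int.bit_length`, which for a positive
integer returns one more than the greatest integer in its binary
logarithm; the two function names are ours.

```python
def leading_exponent(x):
    """n = [log2 x] for a positive integer x, without floating point."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return x.bit_length() - 1

def symbols_needed(a):
    """Binary symbols needed to code every integer from 0 to a, as in (5.6)."""
    return 1 + leading_exponent(a)

# Inequality (5.5) in the equivalent form 2^n <= x < 2^(n+1).
print(all(2 ** leading_exponent(x) <= x < 2 ** (leading_exponent(x) + 1)
          for x in range(1, 10000)))                 # True
print(symbols_needed(255), symbols_needed(256))      # 8 9
```

Integer arithmetic is used instead of a floating-point logarithm so that
exact powers of two are classified correctly.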

With the aid of the binary alphabet $\mathfrak{A}_2$, any type of information
(numbers, commands, logical relations, and so forth) can be written
into the memories of computers.⁵

By a message in a given alphabet $\mathfrak{A}$ we shall mean a finite sequence of
symbols from this alphabet. It is sometimes convenient to divide a
message into standard submessages, which are called words.

Generally speaking, it is possible to define infinite alphabets and
messages, but we shall not consider them here.

A message written in one alphabet can sometimes be translated into
another. For example, as we have already seen, an integer represented
by its decimal digits can also be written in the binary alphabet. One of
the important examples of such translation is the following: Suppose
that we are given an alphabet $\mathfrak{A}$. We define a new alphabet $\mathfrak{A}'$ to be the
set of all words of length less than or equal to some positive integer k
which can be formed using alphabet $\mathfrak{A}$. It is clear that every message in
alphabet $\mathfrak{A}$ can be broken up into a sequence of words of length not
greater than k, which means that it can be recoded in the new alphabet
$\mathfrak{A}'$.

A similar idea could be introduced for messages in the Russian 
language, written in the Russian alphabet supplemented by a space and 
punctuation marks. Here it would be necessary to take a complete word 
list of the Russian language and to assign to each word in it a hieroglyph 
(using, for example, a combination of Chinese and Egyptian writing). 
If one could, in addition, introduce hieroglyphs in such a way that it 
would be possible to distinguish cases and conjugations of verbs, then 
one could recode any message in the Russian language. 

In place of hieroglyphs one might use decimal numbers of six digits. 
The first five digits of such numbers would suffice for coding words;⁶
the sixth digit could be used for coding grammatical signs. 

Here we have for the first time stumbled upon the important notion 
of coding and recoding messages. By codification we mean, generally 
speaking, the formation in a given alphabet of messages containing 
given information or the translation of a message written in one alphabet
into a message written in another. In this respect, “one-to-one”
translations, that is, cases in which it is possible to transform the
information of a message from one language to another in an essentially
unique way, are of most interest. It is easy to see that the translation of
Russian sentences from the alphabetical to the hieroglyphic form has
this property.

5. On this point, see Donald E. Knuth, The Art of Computer Programming,
vol. 1 (Reading, Mass.: Addison-Wesley, 1969).
6. As one could easily make do with a vocabulary of 100,000 Russian words.

In practice, this method of encoding messages by words is used along 
with a method of decoding by means of a word alphabet. 

The reverse situation also occurs, in which a symbol from a given
alphabet $\mathfrak{A}$ is coded in the form of a word written in a simpler alphabet
$\mathfrak{A}'$. For example, suppose the alphabet $\mathfrak{A}'$ consists of three symbols {·, −, *}
(dot, dash, end of letter). Then an arbitrary letter or punctuation mark
can be written in this Morse code (see table 5.1) as a word of at most
seven symbols from the alphabet $\mathfrak{A}'$.


Table 5.1
The Morse Alphabet

Morse     Latin      Morse     Latin Letters
Symbols   Letters    Symbols   (and Arabic Numerals)
.-        A          ...-      V
-...      B          .--       W
-.-.      C          -..-      X
-..       D          -.--      Y
.         E          --..      Z
..-.      F          .----     1
--.       G          ..---     2
....      H          ...--     3
..        I          ....-     4
.---      J          .....     5
-.-       K          -....     6
.-..      L          --...     7
--        M          ---..     8
-.        N          ----.     9
---       O          -----     0
.--.      P          .-.-.-    , (comma)
--.-      Q          ......    . (period)
.-.       R          -.-.-.    ; (semicolon)
...       S          ---...    : (colon)
-         T          ..--..    ? (question mark)
..-       U          --..--    ! (exclamation point)


The marks “*” and “**” for the end of a letter and the end of a 
word, respectively, are coded by intervals of time and, therefore, 
are not included in the table. 


In this way, English words can be written in the Morse alphabet 
instead of the Latin alphabet. 
Example. The English sentence, “What is distance?” can be written
as follows in the Morse alphabet:


.-- .... .- -   .. ...   -.. .. ... - .- -. -.-. . ..--..
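
The recoding of a sentence into the Morse alphabet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `MORSE` and the function `to_morse` are hypothetical names, the dictionary holds just the letters and the question mark, and the end-of-letter and end-of-word marks are rendered here as one space and three spaces rather than as intervals of time.

```python
# Letters of the international Morse alphabet plus the question mark.
MORSE = {
    "a": ".-",   "b": "-...", "c": "-.-.", "d": "-..",  "e": ".",
    "f": "..-.", "g": "--.",  "h": "....", "i": "..",   "j": ".---",
    "k": "-.-",  "l": ".-..", "m": "--",   "n": "-.",   "o": "---",
    "p": ".--.", "q": "--.-", "r": ".-.",  "s": "...",  "t": "-",
    "u": "..-",  "v": "...-", "w": ".--",  "x": "-..-", "y": "-.--",
    "z": "--..", "?": "..--..",
}


def to_morse(text: str) -> str:
    """Recode text symbol by symbol; raises KeyError for unknown symbols."""
    words = text.lower().split()
    # One space closes a letter, three spaces close a word.
    return "   ".join(" ".join(MORSE[ch] for ch in word) for word in words)


print(to_morse("What is distance?"))
```

Each symbol of the Latin alphabet is replaced by one word of the simpler alphabet, which is exactly the "reverse situation" described above.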


As it happens (fortunately for computer technology), any message in
an arbitrary finite alphabet can be recoded in the binary alphabet
$\mathfrak{A}_2 = \{0, 1\}$.

Any nonnegative integer can be represented in the form of equation 
(5.1); that is, in the binary system and, therefore, as a word in the 
binary alphabet. 

If we consider only integers in some range $0 \le x \le a$, the sequence
of binary symbols for $x \sim \varepsilon_n \varepsilon_{n-1} \cdots \varepsilon_1 \varepsilon_0$ can have no more than
$1 + [\log_2 a]$ terms (5.6).
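
The bound (5.6) can be checked numerically. The following is a minimal sketch, assuming Python's built-in integers; the function names `binary_digits` and `symbols_needed` are illustrative and do not come from the text.

```python
def binary_digits(x: int) -> list[int]:
    """Return [eps_n, ..., eps_1, eps_0], the binary digits of x as in (5.1)."""
    if x < 0:
        raise ValueError("x must be nonnegative")
    digits = [x % 2]
    x //= 2
    while x > 0:
        digits.append(x % 2)
        x //= 2
    return digits[::-1]


def symbols_needed(a: int) -> int:
    """Number of binary symbols 1 + [log_2 a] from (5.6), for a >= 1."""
    # For a >= 1, int.bit_length() equals 1 + floor(log2(a)) exactly,
    # which avoids floating-point rounding in math.log2.
    return a.bit_length()


a = 1000
width = symbols_needed(a)  # 2**9 <= 1000 < 2**10
# No integer in the range 0..a needs more than `width` binary symbols.
assert all(len(binary_digits(x)) <= width for x in range(a + 1))
print(binary_digits(19), width)
```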

Now, if we have an arbitrary finite alphabet $\mathfrak{A}$ consisting of $m$
symbols, we can assign to each symbol an integer between 0 and $m - 1$
inclusive. And so, to each symbol of the alphabet $\mathfrak{A}$ it is possible to
assign a binary word, corresponding, in accordance with (5.1), to the
number associated with that symbol. Moreover, it is possible to make
do with words of length $n$, where


$$\log_2 (m - 1) < n \le 1 + \log_2 (m - 1). \tag{5.7}$$
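
As an illustration of (5.7), here is a short sketch that assigns a fixed-length binary word to every symbol of a finite alphabet. The function name `fixed_length_code` is hypothetical, and the symbols of the alphabet are assumed to be distinct.

```python
def fixed_length_code(alphabet: str) -> dict[str, str]:
    """Assign to the i-th symbol the binary word for i, padded to length n."""
    m = len(alphabet)
    # n = 1 + [log_2 (m - 1)] for m >= 2; a one-symbol alphabet still
    # needs one binary symbol per letter.
    n = max(1, (m - 1).bit_length())
    return {symbol: format(i, f"0{n}b") for i, symbol in enumerate(alphabet)}


code = fixed_length_code("abc")
print(code)  # three symbols fit into words of length 2
```

For the 26 Latin letters together with a space symbol ($m = 27$) the same function gives words of length 5, in agreement with $\log_2 26 < 5 \le 1 + \log_2 26$.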


In this way it is possible in the case of any finite alphabet to limit 
oneself to words in the binary alphabet. Modern telegraphy employs an 
international telegraphic code for Russian and Latin letters, numerals, 
and punctuation marks. As an example, we introduce in table 5.2 the 
five-symbol code used in telegraphic apparatus of type CT-35.⁷

The last five combinations are read in the same way in all registers. 

The symbols of the registers indicate that after the appearance of, let 
us say, the symbols of the Latin register, all binary five-symbol com- 
binations are read as Latin letters. In order to switch to Russian letters, 
one must insert the symbol for the Russian register. 

Example. Let us write the following sentence in our telegraph code: 
“The name Shakespeare is written Шекспир in Russian.”


7. At present the so-called “international telegraphic code No. 2” is being used
more and more. The following code, a variation of the “international telegraph
alphabet No. 1” for multiplex systems, is based on an analogous principle.
[Coded text of the example sentence: a sequence of five-symbol binary
combinations, with the register symbols in heavy print.]


Table 5.2
International Telegraphic Code for Russian and Latin Letters

[Table with the columns Latin Register, Numerical Register, Russian
Register, and Code Combination. Its last five rows are Russian Register,
Numerical Register, Latin Register, blank, and bell.]

In the coded text the symbols for the registers are in heavy print, for
it is necessary to notice them in order to be able to change from Russian 
to English, to arrange the punctuation marks, and to write down the 
letter “w.” The latter is placed in the numerical register since there are 
more Russian than Latin letters. 

A five-digit binary code suffices for the representation of all Latin (or 
Russian) letters. Such a code is given in table 5.3. 


Table 5.3 

a 00000 h 00111 o 01110 u 10100 
b 00001 i 01000 p 01111 v 10101 
c 00010 j 01001 q 10000 w 10110 
d 00011 k 01010 r 10001 x 10111 
e 00100 l 01011 s 10010 y 11000
f 00101 m 01100 t 10011 z 11001 
g 00110 n 01101 


The sentence “The length of the hypotenuse is less than the sum of the 
lengths of the two legs” can be coded as follows: 


10011 00111 00100 11010 01011 00100 01101 00110 10011 
00111 11010 01110 00101 11010 10011 00111 00100 11010 
00111 11000 01111 01110 10011 00100 01101 10100 10010 
00100 11010 01000 10010 11010 01011 00100 10010 10010 
11010 10011 00111 00000 01101 11010 10011 00111 00100 
11010 10010 10100 01100 11010 01110 00101 11010 10011 
00111 00100 11010 01011 00100 01101 00110 10011 00111 
10010 11010 01110 00101 11010 10011 00111 00100 11010 
10011 10110 01110 11010 01011 00100 00110 10010 


Note that the separation of the five-digit strings is used here only for 
ease of reading and that the blank entry 11010 has been introduced 
as a space symbol between words. For storage of such a message in the 
memory of a computer or for transmission by means of telegraph, no 
symbols but zero and one are needed. 

To illustrate this point, let us suppose that the above text were written 
as a continuous string of zeros and ones. Then the first line (excluding 
space symbols) would read: 


1001100111001000101100100011010011010011


We could initially separate the first five symbols 10011 and write 
them down. Then we could separate the immediately following five 
symbols 00111 and write them down. In this way we could generate the
complete message by inserting spaces between strings (on the subject of 
separating words in messages, see note 3 on page 29). 

We shall now introduce the idea of a space of communication. Let us
consider an arbitrary alphabet⁸ $\mathfrak{A}$ and the set of messages consisting of
exactly $n$ symbols from the alphabet $\mathfrak{A}$.

We define the distance $d(\xi, \eta)$ between two messages $\xi$ and $\eta$ to be the
number of positions in which the messages $\xi$ and $\eta$ have different
symbols. The metric space $E(n, \mathfrak{A})$ obtained in this way is called the
$n$-dimensional space of communication over the alphabet $\mathfrak{A}$.


Example 1. $\mathfrak{A}$ is the Latin alphabet, $n = 5$. Let $\xi$ = build; $\eta$ = guilt.
All letters but the first and fifth coincide, and so $d(\xi, \eta) = 2$.


Example 2. $\mathfrak{A}_2$ is the binary alphabet, $n = 12$ and $\xi = 000110101010$;
$\eta = 010101101011$. The second, fifth, sixth, and twelfth binary digits
do not coincide, and so $d(\xi, \eta) = 4$.
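
The distance just defined is straightforward to compute. A minimal sketch, with the function name `distance` chosen only for illustration:

```python
def distance(xi: str, eta: str) -> int:
    """Number of positions in which two messages of equal length differ."""
    if len(xi) != len(eta):
        raise ValueError("messages must have the same length n")
    return sum(a != b for a, b in zip(xi, eta))


print(distance("build", "guilt"))                # Example 1
print(distance("000110101010", "010101101011"))  # Example 2
```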

Note that it is possible to compare any words of length not greater
than $n$ if it is agreed that words of less than $n$ letters are augmented by a
previously chosen letter (usually 0 in the binary alphabet) until they are
of length $n$.

Let us verify that the metric d defined above satisfies all of the 
necessary axioms. 

The axiom of symmetry $d(\xi, \eta) = d(\eta, \xi)$ follows from the definition,
in which the roles of $\xi$ and $\eta$ are interchangeable. It is obvious that
$d(\xi, \eta) \ge 0$, and that $d(\xi, \eta) = 0$ only if all of the corresponding symbols
in the messages $\xi$ and $\eta$ coincide, that is, if the words $\xi$ and $\eta$ are the
same.

The triangle inequality is verified as follows: Assume that we are
given three words $\xi$, $\eta$, and $\zeta$ of length $n$. Let us suppose that in the $k$th
position, the symbols of words $\xi$ and $\zeta$ coincide, as well as the symbols
of words $\zeta$ and $\eta$. Then it is clear that for this position the symbols of
words $\xi$ and $\eta$ also coincide.

To be concise, let $\xi_k$ be the $k$th symbol of message $\xi$; $\zeta_k$ the $k$th
symbol of message $\zeta$; and $\eta_k$ the $k$th symbol of message $\eta$. Then if
$\xi_k = \zeta_k$ and $\zeta_k = \eta_k$, $\xi_k = \eta_k$. Taking the contrapositive, if $\xi_k \ne \eta_k$,
then either $\xi_k \ne \zeta_k$ or $\zeta_k \ne \eta_k$.

Thus, words $\xi$ and $\eta$ can have different symbols only in those positions
where either the symbols of words $\xi$ and $\zeta$ or those of words $\zeta$ and $\eta$ do
not coincide. This indicates that the number of symbols of $\xi$ and $\eta$
which differ does not exceed the sum of the number of noncoincident
symbols of $\xi$ and $\zeta$ and those of $\zeta$ and $\eta$. But the number of symbols at
which $\xi$ and $\zeta$ do not coincide plus the number at which $\zeta$ and $\eta$ do not
coincide is $d(\xi, \zeta) + d(\zeta, \eta)$. In other words,


$$d(\xi, \eta) \le d(\xi, \zeta) + d(\zeta, \eta), \tag{5.8}$$


the triangle inequality.

8. In this situation it is not even essential that the alphabet $\mathfrak{A}$ be finite.


Example. In the space $E(5, \mathfrak{A})$, where $\mathfrak{A}$ is the Latin alphabet, let
$\xi$ = trace, $\eta$ = truce, and $\zeta$ = trunk. Clearly, $d(\xi, \eta) = 1$, $d(\xi, \zeta) = 3$,
and $d(\zeta, \eta) = 2$; and so


$$d(\xi, \eta) \le d(\xi, \zeta) + d(\zeta, \eta).$$
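
The triangle inequality can also be confirmed by brute force in a small space of communication. The sketch below, with illustrative names, checks every triple of binary words of length 4 as well as the three words of the example; it is a numerical check, not a substitute for the proof.

```python
from itertools import product


def distance(xi: str, eta: str) -> int:
    """Number of positions in which two words of equal length differ."""
    return sum(a != b for a, b in zip(xi, eta))


def triangle_holds(xi: str, eta: str, zeta: str) -> bool:
    """Check d(xi, eta) <= d(xi, zeta) + d(zeta, eta)."""
    return distance(xi, eta) <= distance(xi, zeta) + distance(zeta, eta)


# All 16 binary words of length 4, hence 16**3 = 4096 triples.
words = ["".join(bits) for bits in product("01", repeat=4)]
all_ok = all(triangle_holds(x, y, z) for x in words for y in words for z in words)
print(all_ok, triangle_holds("trace", "truce", "trunk"))
```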


With the aid of this metric, it is possible to formulate a general
principle for the construction of codes which allows us to correct
mistakes automatically. This principle was first introduced by R.
Hamming.⁹ We shall examine it in the following chapter.

9. See the article by R. Hamming in the collected translations Kody s
obnaruzheniem i ispravleniem oshibok (Codes and the Detection and Correction of
Errors), IL, Moscow, 1956.


6 Automatic Correction of Errors in Messages


In this chapter we shall examine the space of communication $E(n, \mathfrak{A})$;
that is, the space of messages of length $n$ formed in the alphabet $\mathfrak{A}$. As
we have already seen, it is possible to limit oneself exclusively to binary
messages (messages in the alphabet $\mathfrak{A}_2$). All of our interesting examples
will come from this alphabet.

Let us consider the following general scheme for the transmission of 
information (fig. 6.1). Messages emanating from some source are 
recoded into an error-stabilizing code by means of a coding device. 
Then these messages are transmitted along connecting lines, during 
which time the messages might become distorted. Finally, the messages 
are corrected at the receiving end by a decoding device and decoded into 
the initial code if necessary. 






Fig. 6.1 (blocks shown include: source of information, coding device, connecting line)


The automatic detection and correction of errors during the storage 
of information in machine memory occurs in a completely analogous 
manner. 

As information is stored in the memory, it is translated into an
error-stabilizing code. When the message is read, the corresponding decoding
takes place, along with the correction of errors admitted during storage.
By periodically reading, decoding, correcting, recoding, and storing
information, we can be sure of its accuracy. In particular, if we choose
a period of time $T$ during which not too many distortions of the stored
information can arise, and repeat the above process no less frequently
than once per period $T$ of time, the accuracy of the stored information
will be guaranteed. In other words, $T$ must be chosen so small that the
distance $d(\xi, \xi')$ between the stored message $\xi$ and the message $\xi'$ that
is read cannot become too great.

Fig. 6.2 (blocks: source of information (input), coding device, symbols,
decoding device, processed information (output))

We now choose a subset $N_k$ of $E(n, \mathfrak{A})$ with the property that for any
two distinct elements $\xi$ and $\eta$ of $N_k$,


$$d(\xi, \eta) \ge k. \tag{6.1}$$


We shall call the set $N_k$ the set of intelligible words. Let us suppose
that during the transmission and storage of the intelligible word $\xi \in N_k$,
$l$ errors (with $l \le k - 1$) are admitted; that is, $l$ symbols of the word
$\xi$ are incorrectly given. This incorrectly transmitted word we shall
denote as $\xi'$. By the definition of our metric, $d(\xi, \xi') = l$. Clearly, the
word $\xi'$ is not intelligible, for if it were, $d(\xi, \xi')$ would be at least $k$,
and hence greater than $l$, by (6.1).

Thus, we may check the transmitted word $\xi'$ and see that it is not
intelligible (it can, for example, be compared with all the words of $N_k$;
in figures 6.1 and 6.2 this possibility is guaranteed by the availability of
a dictionary). We would then discover that an error had been made.
While the word is in the machine memory, such a process of checking
can take place periodically, where we choose the period $T$ to satisfy the
condition that during the time $T$ there is little chance for more than
$k - 1$ errors to arise in a single word. Thus, we already have a general
principle for the detection of errors.
But we can do even more; we can actually correct the mistakes that
arise. For this purpose we shall assume that the number of mistakes
$l \le (k - 1)/2$. Let $\eta$ be an arbitrary intelligible word distinct from $\xi$,
and $\xi'$, as before, an incorrectly transmitted word.

Applying the triangle inequality,

$$d(\xi, \eta) \le d(\xi, \xi') + d(\xi', \eta).$$

Setting $d(\xi, \xi') = l$, and using (6.1),

$$k \le l + d(\xi', \eta).$$

From this it is clear that

$$d(\xi', \eta) \ge k - l \ge k - \frac{k - 1}{2} = \frac{k + 1}{2}, \tag{6.2}$$

since $l \le (k - 1)/2$.

From the assumption that $l \le (k - 1)/2$ and from (6.2), we conclude
that the incorrect word $\xi'$ is at most $(k - 1)/2$ away from the correct
word $\xi$ and at least as far as $(k + 1)/2$ from any other intelligible word
$\eta$. In other words, we find that the intelligible word $\xi$ is nearest to $\xi'$,
and thus establish the correct message.
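
The two principles, detection by comparison with the dictionary and correction by choosing the nearest intelligible word, can be sketched as follows. The four-word set `CODE` is a made-up example with minimum distance 3 (so $k = 3$ and $l \le 1$); it and the function names are assumptions for illustration only.

```python
from itertools import combinations


def distance(xi: str, eta: str) -> int:
    """Number of positions in which two words of equal length differ."""
    return sum(a != b for a, b in zip(xi, eta))


def minimum_distance(code: list[str]) -> int:
    """Largest k for which the set satisfies (6.1)."""
    return min(distance(a, b) for a, b in combinations(code, 2))


def is_distorted(received: str, code: list[str]) -> bool:
    """Detection: a received word outside the dictionary contains an error."""
    return received not in code


def correct(received: str, code: list[str]) -> str:
    """Correction: pick the intelligible word nearest to the received word."""
    return min(code, key=lambda word: distance(received, word))


CODE = ["00000", "01101", "10110", "11011"]  # hypothetical set N_3
received = "01111"                           # "01101" with one symbol changed
print(minimum_distance(CODE), is_distorted(received, CODE), correct(received, CODE))
```

Comparing with every word of the dictionary is practical only for small sets; the codes constructed later in the chapter avoid this search.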

The above discussion seems to suggest the usefulness of determining
the Dirichlet region of each intelligible word. For each word $\xi'$ to be
corrected, it must be determined to which Dirichlet region the message
belongs. The intelligible word determining this Dirichlet region is
considered the correct word.

Here is where Hamming's remarkable idea comes in. This idea is
based on the fact that for the purposes of transmission, one need not 
use all possible combinations of symbols from the alphabet, but only 
some set of intelligible words. Since in English only certain combinations 
of letters are used as intelligible words, the sense of distorted words can 
frequently be established without the use of additional codings. This has 
already been illustrated. 

We shall now examine the means by which error-stabilizing codes are
constructed in practice; in particular, the construction of sets of
intelligible words $N_k \subset E(n, \mathfrak{A})$. All of our examples will come from
the binary alphabet $\mathfrak{A}_2$. As we have already seen, such a condition is not
a limitation, for it is possible to write any message in the binary
alphabet.

The problem of error-stabilizing codification can be formulated in
the following way. Suppose that we have the space of $s$-symbol binary
messages $E(s, \mathfrak{A}_2)$. We must place in correspondence with each such
message a message from some set $N_k \subset E(n, \mathfrak{A}_2)$. This set of intelligible
words $N_k$ must be stable with respect to $l$-place errors. We shall call the
quantity $(n - s)/n$ the redundancy of the code.

Since the exact formulation of this problem must involve the probable
distortion occurring during transmission, it is necessary to construct the
code (the set $N_k \subset E(n, \mathfrak{A}_2)$) in such a way that the probability of
receiving more than $l$ errors in a word of length $n$ is sufficiently small.
This more complete formulation of the problem is studied in information
theory, but it need not concern us here.

In the construction of these codes it is especially convenient to
introduce and apply the concept of addition modulo 2; that is, according
to the rules


$$0 \oplus 0 = 0, \quad 1 \oplus 0 = 1,$$
$$0 \oplus 1 = 1, \quad 1 \oplus 1 = 0.$$


The circled plus sign indicates that the operation carried out is not
ordinary addition. The distance between two binary words $\xi =
x_1 x_2 \cdots x_n$ and $\eta = y_1 y_2 \cdots y_n$ (where all $x_i$ and $y_i$ have the value 0 or 1)
can now be written in the following way:


$$d(\xi, \eta) = (x_1 \oplus y_1) + (x_2 \oplus y_2) + \cdots + (x_n \oplus y_n).$$


Since ones will appear as terms in this sum when and only when
corresponding symbols differ, the total will be exactly equal to the
defined distance $d(\xi, \eta)$.
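
In most programming languages addition modulo 2 of single binary digits is the "exclusive or" operation. A brief sketch in Python, where it is written `^`; the function names are illustrative.

```python
def distance_mod2(xi: str, eta: str) -> int:
    """Ordinary sum of the digitwise sums modulo 2 of two binary words."""
    if len(xi) != len(eta):
        raise ValueError("words must have the same length n")
    return sum(int(x) ^ int(y) for x, y in zip(xi, eta))


def distance_by_counting(xi: str, eta: str) -> int:
    """The defined distance: the number of positions that differ."""
    return sum(x != y for x, y in zip(xi, eta))


xi, eta = "000110101010", "010101101011"
print(distance_mod2(xi, eta), distance_by_counting(xi, eta))
```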

Let us consider the space of messages $E(n, \mathfrak{A}_2)$ and associate with
each word $\xi \in E(s, \mathfrak{A}_2)$ a word $\xi'$ of length $s + 1$, formed according to
the following rule: The first $s$ symbols of the word $\xi'$ will be the same as
those of the word $\xi$. The last (($s + 1$)th) symbol of the word $\xi'$ is chosen
in such a way as to make the sum (ordinary) of binary symbols in the
word $\xi'$ even. In other words, if $\xi' = x_1 x_2 \cdots x_s x_{s+1}$,


$$x_1 \oplus x_2 \oplus \cdots \oplus x_s \oplus x_{s+1} = 0. \tag{6.3}$$


This equality (and some easily verified properties of addition modulo 2)
allows us to express $x_{s+1}$ explicitly:


$$x_{s+1} = x_1 \oplus x_2 \oplus \cdots \oplus x_s. \tag{6.4}$$

For example, if $\xi = 001011101$, then $\xi' = 0010111011$.

The words $\xi'$ formed in this way define the set of intelligible words
$N_2 \subset E(s + 1, \mathfrak{A}_2)$.
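
Rule (6.4) and criterion (6.3) translate directly into code. The sketch below uses illustrative function names and represents words as strings of the characters 0 and 1.

```python
def add_control_symbol(xi: str) -> str:
    """Append x_{s+1} = x_1 (+) x_2 (+) ... (+) x_s, as in (6.4)."""
    last = 0
    for symbol in xi:
        last ^= int(symbol)  # addition modulo 2
    return xi + str(last)


def satisfies_evenness(word: str) -> bool:
    """Criterion (6.3): the number of ones in the word is even."""
    return word.count("1") % 2 == 0


word = add_control_symbol("001011101")
print(word, satisfies_evenness(word))
# Changing any single symbol makes the number of ones odd.
print(satisfies_evenness("1" + word[1:]))
```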

It is clear that the distance between two distinct intelligible words $\xi'$
and $\xi''$ must be even, for if $\xi'$ differed from $\xi''$ in an odd number of
positions, then the sum of the units in one of the words $\xi'$ or $\xi''$ would be
odd, a situation made impossible by the construction of these words.
And because the smallest even number not equal to zero is two, the
minimum distance between distinct intelligible words is two; thus the
subscript 2 is used.

Consequently, this code allows us to detect errors of one digit by
counting the nonzero digits; if the evenness criterion (6.3) is not satisfied,
then the word contains an error. This error detection process is widely
known as the evenness test and is very frequently applied because of its
simplicity. The redundancy of the error detection code is

$$\frac{(s + 1) - s}{s + 1} = \frac{1}{s + 1}.$$


We shall now consider a beautiful example of a code (due to
Hamming) which is capable of correcting single-digit errors.

Let $\xi \in E(s, \mathfrak{A}_2)$ be a binary word of length $s$. We form the word
$\xi' \in E(n, \mathfrak{A}_2)$ according to the following rule: Among the $n$ positions in
the word $\xi'$, we choose the first, the second, the fourth, ..., the ($2^k$)th
positions for controlling symbols which are determined by the word $\xi$.
Between these positions are the significant positions. In the example


$$\xi' = 1001110010011011011101100010010,$$


we have the following mutual distribution of the controlling positions
(the 1st, 2nd, 4th, 8th, and 16th) and the significant positions for the case
$s = 26$, $n = 31$, $k = 4$. In order to make $s$ significant positions available,
the number of controlling positions $(k + 1)$ plus the number $s$ of
significant positions must lie between the $k$th and $(k + 1)$th powers of
2; that is, it is necessary and sufficient that


$$2^k \le s + k + 1 < 2^{k+1}. \tag{6.5}$$


The redundancy of the given code is

$$\frac{(s + k + 1) - s}{s + k + 1} = \frac{k + 1}{s + k + 1}.$$

The $(i + 1)$th controlling position (position $2^i$) is filled according to
the following rule: Each position of the word $\xi'$ is defined by a number
$l$, counting from the beginning of the word. We examine the binary
representation of this number:


$$l = l_k 2^k + l_{k-1} 2^{k-1} + \cdots + l_1 2 + l_0$$


(the number of binary elements in the representation of the number $l$
is defined so that, in accordance with (6.5), $l < 2^{k+1}$).

Let us now consider the set $\pi_i$ of all those positions $l$ for which $l_i = 1$.
This set contains exactly one controlling position, the position with the
number $l = 2^i$. We fill this position in such a way that the sum of all the
ones in the positions of $\pi_i$ will be even.

In table 6.1 we give an example of a word $\xi'$, which can be read
vertically in the second column. We have shown the binary number of
the positions and marked with a star those positions belonging to the
set $\pi_i$. Words $\xi'$ constructed according to this rule shall be called
intelligible.

We shall show that the distance between any two intelligible words
$\xi'$ and $\eta'$ is not less than 3; that is, that the intelligible words form a set
$N_3 \subset E(s + k + 1, \mathfrak{A}_2)$.

Case 1. Suppose words $\xi$ and $\eta$ from $E(s, \mathfrak{A}_2)$ differ in at least three
positions. Clearly, then, the words $\xi'$ and $\eta'$ likewise differ in at least
three positions and, consequently, $d(\xi', \eta') \ge 3$.

Case 2. Let the words $\xi$ and $\eta$ differ in exactly two positions. Then the words
$\xi'$ and $\eta'$ differ in two significant positions, say positions $l$ and $l'$.
Since $l \ne l'$, the binary representation of $l$ differs from that of $l'$ in at
least one place. Let $i$ be the place in which the two representations differ
and, without loss of generality, say $l_i = 1$ and $l'_i = 0$. Then $l \in \pi_i$, but
$l' \notin \pi_i$.

Because the words $\xi'$ and $\eta'$ differ in only two significant positions,
and since $l' \notin \pi_i$, the sum of the significant digits in $\pi_i$ for the word $\xi'$
and the sum of digits in the corresponding set for $\eta'$ must differ in parity.
As the sum of the digits in the set of positions $\pi_i$ must be made even both for
$\xi'$ and for $\eta'$, the words $\xi'$ and $\eta'$ must differ again in the controlling
position of the set $\pi_i$ (in position $2^i$). Thus, $\xi'$ and $\eta'$ differ in at least
three positions, and $d(\xi', \eta') \ge 3$.

Case 3. Let the words $\xi$ and $\eta$ differ in exactly one position. Then
the words $\xi'$ and $\eta'$ differ in exactly one significant position with the
number $l$. This number cannot be a power of two, since numbers of the
form $2^i$ are used for the controlling positions. Therefore, the number $l$
has at least two nonzero binary digits $l_i = 1$ and $l_j = 1$. Consequently,
position $l$ is in both $\pi_i$ and $\pi_j$. Since the sum of the digits in these sets
must be made even for both $\xi'$ and $\eta'$, $\xi'$ and $\eta'$ must differ in controlling
positions $2^i$ and $2^j$.
Table 6.1

Position   Contents of
No.        the Position   π_0   π_1   π_2   π_3   π_4

00001           1          *
00010           0                *
00011           0          *     *
00100           1                      *
00101           1          *           *
00110           1                *     *
00111           0          *     *     *
01000           0                            *
01001           1          *                 *
01010           0                *           *
01011           0          *     *           *
01100           1                      *     *
01101           1          *           *     *
01110           0                *     *     *
01111           1          *     *     *     *
10000           1                                  *
10001           0          *                       *
10010           1                *                 *
10011           1          *     *                 *
10100           1                      *           *
10101           0          *           *           *
10110           1                *     *           *
10111           1          *     *     *           *
11000           0                            *     *
11001           0          *                 *     *
11010           0                *           *     *
11011           1          *     *           *     *
11100           0                      *     *     *
11101           0          *           *     *     *
11110           1                *     *     *     *
11111           0          *     *     *     *     *



In each case, then, the words $\xi'$ and $\eta'$ differ in at least three positions,
and $d(\xi', \eta') \ge 3$.

And so the set of intelligible messages is an $N_3$ set; consequently, one
can in principle restore distorted messages, provided that the error occurs
in only a single digit.

In the binary case, this process of restoration can be carried out with 
comparative ease, for to restore a word in the binary alphabet, it is 
sufficient to determine the number of the position in which the error 
has occurred and to change the entry in this position from 0 to 1 or 
vice versa. 
In the code that we are examining, the number of the position with an
incorrect entry can be ascertained by the following method: After the
transmission of the message $\xi'$, during which only one digit can be
distorted (resulting in the message $\xi^*$), we check whether the sum of the
digits is even or odd for each set of positions $\pi_i$. In other words, we
calculate the controlling quantities


$$\alpha_i = \bigoplus_{l \in \pi_i} x^*_l,$$


where $\alpha_i$ is equal to the sum modulo 2 of all the symbols $x^*_l$ in the positions
of the set $\pi_i$ of the received message $\xi^*$.

If all of the $\alpha_i = 0$, then $\xi^*$ is an intelligible message.¹ If, however,
some $\alpha_i = 1$, then an error must exist in a position $l$ belonging to the set
$\pi_i$, that is, in a position where the $i$th binary digit is equal to 1.
Conversely, for each $\alpha_i = 0$, no error occurs in any position belonging to
the set $\pi_i$ (since two errors that cancel each other out are extremely
improbable).

Thus, the controlling quantities $\alpha_i$ are just the binary digits in the
expansion of the number of the position in which the error has occurred;
that is,


$$l = \alpha_k 2^k + \alpha_{k-1} 2^{k-1} + \cdots + \alpha_1 2 + \alpha_0. \tag{6.6}$$


So the controlling quantities of the received message completely
determine $l$ and enable us to restore the correct message $\xi'$.

Let us take, for example, the word $\xi'$ written in table 6.1 and distort it
in the nineteenth position.

We obtain the word


$$\xi^* = 1001110010011011010101100010010;$$

carrying out our test, we find

$$\alpha_4 = 1, \quad \alpha_3 = 0, \quad \alpha_2 = 0, \quad \alpha_1 = 1, \quad \alpha_0 = 1,$$


that is, $l = 10011$, the binary representation of the decimal number 19.
Changing 0 to 1 in the nineteenth position of the word $\xi^*$, we obtain
$\xi'$, the word that we started with.
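
The whole procedure (filling the controlling positions, computing the controlling quantities, and restoring the word) can be sketched compactly. The function names are illustrative. The sketch relies on one observation that is an assumption of this example rather than a statement of the text: adding modulo 2, digit by digit, the binary numbers of all positions that contain a one yields exactly the quantities $\alpha_k \cdots \alpha_1 \alpha_0$.

```python
def control_number(word: str) -> int:
    """Return l = alpha_k 2^k + ... + alpha_0 as in (6.6); 0 if all tests pass."""
    l = 0
    for position, symbol in enumerate(word, start=1):
        if symbol == "1":
            l ^= position  # binary digit i of l is the parity of the ones in pi_i
    return l


def encode(xi: str) -> str:
    """Put the symbols of xi into the significant positions, then fill 1, 2, 4, ..."""
    word, data, position = [], iter(xi), 1
    remaining = len(xi)
    while remaining:
        if position & (position - 1) == 0:  # power of two: controlling position
            word.append("0")
        else:
            word.append(next(data))
            remaining -= 1
        position += 1
    l = control_number("".join(word))
    i = 0
    while (1 << i) <= len(word):
        if (l >> i) & 1:                    # make the sum over pi_i even
            word[(1 << i) - 1] = "1"
        i += 1
    return "".join(word)


def restore(received: str) -> str:
    """Correct a single distorted digit, if the tests reveal one."""
    l = control_number(received)
    if l == 0 or l > len(received):
        return received
    changed = "1" if received[l - 1] == "0" else "0"
    return received[:l - 1] + changed + received[l:]


received = "1001110010011011010101100010010"  # the distorted word of the example
print(control_number(received), restore(received))
```

For the distorted word of the example the sketch reports position 19 and restores the word whose controlling quantities are all zero.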


1. That is, if an error exists, it is an error of at least three positions, a situation 
made impossible by the fact that the transmission time is so short that it is highly 
improbable for more than one error to occur. 




A simpler code allowing us to correct single mistakes would result if
the word $\xi'$ were given by a triple repetition of the word $\xi \in E(s, \mathfrak{A}_2)$.
Then if $\xi$ and $\eta$ differ in $r$ positions, the corresponding words $\xi'$ and $\eta'$
would differ in $3r$ positions. Thus, $d(\xi', \eta') \ge 3$, provided that $\xi' \ne \eta'$.
The transmitted word is checked in the following manner.

A triple of positions with the numbers $l$, $l + s$, $l + 2s$, where
$1 \le l \le s$, is considered. If the symbols in these positions coincide, the
corresponding symbol is considered to have been given correctly. Since
the binary language contains only two letters, the symbols in two of
these three positions must coincide; and so, if only two of these symbols
coincide, their common meaning is considered to be the correct one
and is entered in the third position.
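
The check of triples of positions amounts to a majority vote. A minimal sketch, with illustrative function names:

```python
def encode_triple(xi: str) -> str:
    """Repeat the word three times."""
    return xi * 3


def decode_triple(received: str) -> str:
    """For each l, take the symbol on which at least two of the positions
    l, l + s, l + 2s agree."""
    if len(received) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    s = len(received) // 3
    restored = []
    for l in range(s):
        triple = received[l] + received[l + s] + received[l + 2 * s]
        restored.append("1" if triple.count("1") >= 2 else "0")
    return "".join(restored)


sent = encode_triple("1011")
distorted = sent[:5] + "1" + sent[6:]  # the sixth symbol changed from 0 to 1
print(sent, distorted, decode_triple(distorted))
```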

Thus, this code is capable of correcting single errors in each triple of
corresponding positions. The weakness of the code is its high
redundancy, which is $(3s - s)/3s = 2s/3s = 2/3$. The redundancy of the
former code is


$$\frac{k + 1}{s + k + 1} = \frac{[\log_2 (s + k + 1)] + 1}{s + k + 1},$$


where the square brackets denote the greatest integer function (since
$2^k \le s + k + 1 < 2^{k+1}$); setting the length of the word $(s + k + 1)$
equal to $n$, the redundancy becomes $\{[\log_2 n] + 1\}/n$, which, for large $n$
(long words) is practically zero.
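
The two redundancies can be compared numerically. The sketch below uses exact fractions; the function names are illustrative.

```python
from fractions import Fraction


def triple_redundancy(s: int) -> Fraction:
    """(3s - s)/3s for the triple-repetition code."""
    return Fraction(3 * s - s, 3 * s)


def controlled_redundancy(n: int) -> Fraction:
    """{[log_2 n] + 1}/n for the code with controlling positions 1, 2, 4, ..."""
    # For n >= 1, n.bit_length() equals [log_2 n] + 1.
    return Fraction(n.bit_length(), n)


print(triple_redundancy(26), controlled_redundancy(31), controlled_redundancy(1023))
```

For words of length 31 the redundancy is already below one sixth, and for length 1023 it is below one percent, whereas the triple repetition always costs two thirds.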

Codes that allow the correction of errors in the transmission and 
storage of information are very important in various automatic control 
devices. The last twenty years have seen the appearance of a great 
number of works concerning error-stabilizing codes that allow us to 
correct multiple as well as single errors. 


7 Metrics and Norms in Multi-dimensional Spaces


In this chapter we examine an n-dimensional vector space $R^n$ and
various distance functions which determine metric spaces. The vector
space $R^n$ serves as a generalization of the concepts of line ($R^1$), plane
($R^2$), and three-space ($R^3$) considered in elementary geometry. We can
arrive at a reasonable definition of the n-dimensional space $R^n$ (n-space)
in the following manner.

We consider the plane with some system of Cartesian coordinates.
Each point $M$ on the plane is uniquely defined by a pair of coordinates
$(x, y)$, where $x \in R$, $y \in R$ (here $R$ denotes the set of real numbers).

To each point M there corresponds in a one-to-one manner a vector 
joining the origin of the coordinate system to that point (see fig. 7.1). 
Thus, there exists a one-to-one correspondence between any two of the 
following objects: 

The point $M$ $\leftrightarrow$ the vector $OM$ $\leftrightarrow$ the pair of coordinates $(x, y)$.

Consequently, we may think of the plane interchangeably as a set of
points, a set of vectors, or a set of ordered pairs $(x, y)$ of real numbers.
Analogously, we can consider three-space as a set of ordered triples
$(x, y, z)$ of real numbers. Our desired definition (of n-space)
suggests itself.

Fig. 7.1 (the point $M(x, y)$ in the plane with origin $O$ and axes $x$ and $y$)

By a vector in n-space ($R^n$) we mean an ordered n-tuple of real
numbers


$$\xi = (x_1, x_2, \ldots, x_n).$$


The numbers $x_1, x_2, \ldots, x_n$ are called the coordinates of the vector $\xi$.
The set of all such vectors is the n-dimensional vector space $R^n$.
Clearly, the vector space $R^3$ is ordinary three-space; the vector space
$R^2$ is the plane; and the space $R^1$ is the straight line.
Three operations are defined on vectors in R”: addition, subtraction, 
and multiplication by a scalar (real number). These operations are 
defined as follows. 


The sum of the vectors $\xi = (x_1, x_2, \ldots, x_n)$ and $\eta = (y_1, y_2, \ldots, y_n)$
is the vector


$$\zeta = \xi + \eta = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n),$$


the coordinates of which are the sums of the corresponding coordinates
of $\xi$ and $\eta$.
Analogously, the difference of these same vectors $\xi$ and $\eta$ is the vector


$$\delta = \xi - \eta = (x_1 - y_1, x_2 - y_2, \ldots, x_n - y_n),$$


whose coordinates are the differences of the corresponding coordinates.
The product of the scalar $a$ and the vector $\xi = (x_1, x_2, \ldots, x_n)$ is the
vector


$$\rho = a\xi = (a x_1, a x_2, \ldots, a x_n).$$
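
The three operations act coordinate by coordinate, which makes them easy to write down in code. A minimal sketch, representing a vector of $R^n$ as a Python tuple; the names `add`, `subtract`, and `scale` are illustrative.

```python
Vector = tuple[float, ...]


def add(xi: Vector, eta: Vector) -> Vector:
    """Sum: add corresponding coordinates."""
    if len(xi) != len(eta):
        raise ValueError("vectors must belong to the same space")
    return tuple(x + y for x, y in zip(xi, eta))


def subtract(xi: Vector, eta: Vector) -> Vector:
    """Difference: subtract corresponding coordinates."""
    if len(xi) != len(eta):
        raise ValueError("vectors must belong to the same space")
    return tuple(x - y for x, y in zip(xi, eta))


def scale(a: float, xi: Vector) -> Vector:
    """Product of the scalar a and the vector xi."""
    return tuple(a * x for x in xi)


xi, eta = (1, 2, 3), (4, 5, 6)
print(add(xi, eta), subtract(xi, eta), scale(2, xi))
```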


In other words, to multiply a vector by a scalar, we multiply each of 
the coordinates by the scalar. On the plane and in three-space these 
operations have natural geometrical interpretations. For the sake of 
clarity we shall examine two vectors, $\xi$ and $\eta$, in the plane (fig. 7.2).
From the diagram it is clear that the sum $\zeta = \xi + \eta$ is the vector
formed by the diagonal of the parallelogram determined by the vectors
$\xi$ and $\eta$. This property of vector addition is useful in physical
considerations involving sums of vector quantities such as forces and
momenta.


Fig. 7.2

The difference of the vectors $\xi$ and $\eta$ (fig. 7.3) is the vector directed
from the end of the vector $\eta$ to the end of the vector $\xi$.

The product of the positive number $a$ and the vector $\xi$ is a vector
having the same direction as $\xi$, but of length $a$ times the length of $\xi$.
(Clearly, when $a < 1$, the length of the vector $a\xi$ is less than that of the
vector $\xi$.) To multiply the vector $\xi$ by the negative number $a$, one must
multiply it by $|a|$ and then take the vector with the same length but
opposite direction. All of these cases are pictured in figure 7.4.





Fig. 7.4 


One can easily verify that the operations on n-dimensional vectors 
defined above satisfy the following properties, which are analogous to 
the properties of the corresponding operations defined on the real 
numbers. Here $\xi, \eta, \zeta \in R^n$ and $a, b \in R$. The symbol 0 is used to denote
both the real number zero and the zero vector $(0, 0, \ldots, 0) \in R^n$. When
0 is written as the sum or difference of vectors, it denotes the zero vector.
All scalars are written to the left of vectors in a scalar multiplication. 

1. $\xi + \eta = \eta + \xi$ (commutativity),

2. $\xi + (\eta + \zeta) = (\xi + \eta) + \zeta$ (additive associativity),

3. $\xi - \xi = 0$,

4. $0 + \xi = \xi$, where $0 \in R^n$,

5. $a(\xi + \eta) = a\xi + a\eta$ (distributivity of scalar multiplication over
vector addition),

6. $(a + b)\xi = a\xi + b\xi$ (distributivity of scalar multiplication over
scalar addition),

7. $a(b\xi) = (ab)\xi$ (multiplicative associativity),

8. $0 \cdot \xi = 0$, where the 0 on the left is the scalar $0 \in R$,

9. $a \cdot 0 = 0$, where $0 \in R^n$,

10. $1 \cdot \xi = \xi$.
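
The coordinate-wise operations and the ten properties just listed can be checked mechanically on sample data. The following is only a minimal sketch: the function names `add`, `sub`, and `scale` and the sample vectors are assumptions made for illustration, and exact fractions are used so that equality tests are not disturbed by round-off.

```python
from fractions import Fraction


def add(xi, eta):
    """Coordinate-wise sum of two vectors of equal length."""
    return tuple(x + y for x, y in zip(xi, eta))


def sub(xi, eta):
    """Coordinate-wise difference of two vectors of equal length."""
    return tuple(x - y for x, y in zip(xi, eta))


def scale(a, xi):
    """Product of the scalar a and the vector xi."""
    return tuple(a * x for x in xi)


# Arbitrary sample data in R^3 (hypothetical values).
xi = (Fraction(1), Fraction(-2), Fraction(3, 2))
eta = (Fraction(4), Fraction(0), Fraction(-1))
zeta = (Fraction(2), Fraction(5), Fraction(7))
a, b = Fraction(3), Fraction(-1, 2)
zero = (Fraction(0),) * 3

# One entry per numbered property in the list above.
checks = {
    1: add(xi, eta) == add(eta, xi),
    2: add(xi, add(eta, zeta)) == add(add(xi, eta), zeta),
    3: sub(xi, xi) == zero,
    4: add(zero, xi) == xi,
    5: scale(a, add(xi, eta)) == add(scale(a, xi), scale(a, eta)),
    6: scale(a + b, xi) == add(scale(a, xi), scale(b, xi)),
    7: scale(a, scale(b, xi)) == scale(a * b, xi),
    8: scale(0, xi) == zero,
    9: scale(a, zero) == zero,
    10: scale(1, xi) == xi,
}
print(all(checks.values()))
```

A check on one triple of vectors is of course not a proof; each identity follows in general from the corresponding property of real numbers applied coordinate by coordinate.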
Clearly, two vectors, $\xi$ and $\eta$, are equal if and only if $\xi - \eta = 0$.

Let us consider some other examples of multidimensional spaces 
that arise naturally in geometry. 

Example 1. The set of all spheres in three-space. Each sphere is 
uniquely determined by an ordered 4-tuple (x, y, z, R), where (x, y, z) 
is the center and R the radius. 

Example 2. The set of all triangles in three-space. Each triangle is
uniquely determined by an ordered 9-tuple $(x_1, y_1, z_1, x_2, y_2, z_2, x_3, y_3, z_3)$,
where the triple $(x_i, y_i, z_i)$ gives the coordinates of the ith vertex of the
triangle ($i = 1$, 2, or 3). We suggest that the reader convince himself that
in both of these examples multiplication of all of the coordinates (that 
is, of the corresponding four- or nine-dimensional vector) by the real 
number a is equivalent to performing a similarity transformation with 
the center of similarity at the origin. We further suggest that the reader 
make a more detailed study of various possible metrics in the spaces of 
spheres and triangles. 

Let us now examine various metrics which we can define on $R^n$ to
form a metric space.

The metric $d_2$ (determining the metric space $l_2^{(n)}$) is defined by

$$d_2(\xi, \eta) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2}, \tag{7.1}$$

where $\xi = (x_1, x_2, \ldots, x_n)$ and $\eta = (y_1, y_2, \ldots, y_n)$.

In three-space and in the plane, the metric $d_2$ is just the ordinary
geometric distance function. Properties 1, 2, and 3 are obvious for this
metric.

The space $l_1^{(n)}$ is defined by the metric $d_1$, where

$$d_1(\xi, \eta) = |x_1 - y_1| + |x_2 - y_2| + \cdots + |x_n - y_n|. \tag{7.2}$$

In the plane (the space $l_1$ defined by the set $R^2$ and the metric $d_1$) this
metric coincides with the metric $d_1$ defined in chapter 4. Again,
properties 1, 2, and 3 are obvious.

The space $C^{(n)}$ results if we define a metric $d_\infty$ according to the rule

$$d_\infty(\xi, \eta) = \max(|x_1 - y_1|, |x_2 - y_2|, \ldots, |x_n - y_n|); \tag{7.3}$$

that is, $d_\infty(\xi, \eta)$ is the maximal deviation of corresponding coordinates.
Properties 1, 2, and 3 are obviously satisfied by this metric also. For the
plane $R^2$ it coincides with the metric $d_\infty$ introduced in chapter 4.
Let us prove the triangle inequality for the space $l_1^{(n)}$.
Let

$$\xi = (x_1, x_2, \ldots, x_n); \quad \eta = (y_1, y_2, \ldots, y_n); \quad \zeta = (z_1, z_2, \ldots, z_n).$$

Then, obviously,

$$\begin{aligned}
d_1(\xi, \eta) &= |x_1 - y_1| + |x_2 - y_2| + \cdots + |x_n - y_n| \\
&= |x_1 - z_1 + z_1 - y_1| + |x_2 - z_2 + z_2 - y_2| + \cdots + |x_n - z_n + z_n - y_n| \\
&\le |x_1 - z_1| + |z_1 - y_1| + |x_2 - z_2| + |z_2 - y_2| + \cdots + |x_n - z_n| + |z_n - y_n| \\
&= d_1(\xi, \zeta) + d_1(\zeta, \eta),
\end{aligned}$$

and the triangle inequality holds.

For the space $C^{(n)}$ with the metric $d_\infty$, the triangle inequality is proved
as follows. Let $|x_k - y_k|$ be the largest of the differences of corresponding
coordinates; that is,

$$\begin{aligned}
d_\infty(\xi, \eta) &= \max(|x_1 - y_1|, |x_2 - y_2|, \ldots, |x_n - y_n|) \\
&= |x_k - y_k| = |x_k - z_k + z_k - y_k| \\
&\le |x_k - z_k| + |z_k - y_k|.
\end{aligned} \tag{7.4}$$

It is obvious that

$$\begin{aligned}
|x_k - z_k| &\le \max(|x_1 - z_1|, |x_2 - z_2|, \ldots, |x_n - z_n|) = d_\infty(\xi, \zeta), \\
|z_k - y_k| &\le \max(|z_1 - y_1|, |z_2 - y_2|, \ldots, |z_n - y_n|) = d_\infty(\zeta, \eta).
\end{aligned} \tag{7.5}$$

Combining (7.4) and (7.5), we get the desired relation

$$d_\infty(\xi, \eta) \le d_\infty(\xi, \zeta) + d_\infty(\zeta, \eta).$$
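
The three metrics defined so far are easy to compute, and the triangle inequality just proved can be spot-checked on a few points. This is a minimal sketch; the function names and the sample points are assumptions chosen for illustration, and a numerical check on finitely many points illustrates the inequality but does not replace the proofs above.

```python
import itertools
import math


def d1(xi, eta):
    """Metric (7.2): sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(xi, eta))


def d2(xi, eta):
    """Metric (7.1): ordinary Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xi, eta)))


def d_inf(xi, eta):
    """Metric (7.3): largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(xi, eta))


def triangle_holds(d, points, tol=1e-12):
    """Check d(p, q) <= d(p, r) + d(r, q) for every triple of sample points."""
    return all(
        d(p, q) <= d(p, r) + d(r, q) + tol
        for p, q, r in itertools.product(points, repeat=3)
    )


xi, eta = (1, 2, 3), (4, 6, 3)
print(d1(xi, eta), d2(xi, eta), d_inf(xi, eta))  # 7 5.0 4

sample = [(0, 0, 0), (1, 2, 3), (4, 6, 3), (-2, 5, 1)]
print(all(triangle_holds(d, sample) for d in (d1, d2, d_inf)))
```

For the pair of points used here the coordinate differences are 3, 4, and 0, which is why the three distances come out as 7, 5, and 4.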


A general class of metric spaces over $R^n$ is obtained if we introduce a
metric $d_p$ defined by the formula

$$d_p(\xi, \eta) = \sqrt[p]{|x_1 - y_1|^p + |x_2 - y_2|^p + \cdots + |x_n - y_n|^p},$$

where $p \ge 1$; the space obtained in this way is called $l_p^{(n)}$.

Properties 1, 2, and 3 can be verified in this case exactly as before.
The triangle inequality is derived from Minkowski's inequality (see the
footnote on page 21):

$$\sqrt[p]{|a_1 - b_1|^p + |a_2 - b_2|^p + \cdots + |a_n - b_n|^p}
\le \sqrt[p]{|a_1|^p + |a_2|^p + \cdots + |a_n|^p} + \sqrt[p]{|b_1|^p + |b_2|^p + \cdots + |b_n|^p}.$$

It is not difficult to see that for $p = 1$ and $p = 2$ the spaces $l_1^{(n)}$ and
$l_2^{(n)}$ defined above are obtained. As $p \to \infty$ the metric $d_p$ approaches the
metric $d_\infty$; that is, $l_\infty^{(n)} = C^{(n)}$. The reader can easily verify this by
generalizing the analogous argument in chapter 4.
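
The limiting behaviour of $d_p$ as $p$ grows can be observed numerically. The sketch below is only an illustration under stated assumptions: the function name `d_p`, the sample vectors, and the chosen values of $p$ are hypothetical, and moderate values of $p$ are used so that the powers stay within floating-point range.

```python
def d_p(xi, eta, p):
    """p-th root of the sum of p-th powers of absolute coordinate differences."""
    return sum(abs(x - y) ** p for x, y in zip(xi, eta)) ** (1.0 / p)


xi, eta = (1, 2, 3), (4, 6, 3)   # coordinate differences 3, 4, 0
largest = max(abs(x - y) for x, y in zip(xi, eta))

for p in (1, 2, 4, 10, 100):
    # The gap to the largest coordinate difference shrinks as p grows.
    print(p, d_p(xi, eta, p), d_p(xi, eta, p) - largest)
```

The reason is visible in the formula: once $p$ is large, the largest difference dominates the sum of $p$-th powers, so the $p$-th root is close to that largest difference.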

A more difficult exercise is to show that the sphere of radius $r$ in the
space $l_1^{(3)}$ is an octahedron (fig. 7.5), whereas in the space $C^{(3)}$ it is a cube
(fig. 7.6).








Fig. 7.5 Fig. 7.6 


The spaces $l_p^{(n)}$ over $R^n$, like the corresponding spaces $l_p$ in the plane
($R^2$), are called Minkowski spaces. These spaces can be generalized very
naturally to an infinite-dimensional vector space whose elements are
vectors with an infinite number of coordinates.

A more general class of metrics on $R^n$ can be defined with the
introduction of the concept of convex body.


Let us introduce several new definitions. We shall interpret a vector $\xi$ in $R^n$
as the point corresponding to the terminal point of that vector when the
initial point is placed at the origin. We say that a subset $V$ of $R^n$ is convex if for
each pair of vectors $\xi \in V$ and $\eta \in V$, all vectors of the form $a\xi + (1 - a)\eta$,
where $a$ is an arbitrary number between zero and one ($0 \le a \le 1$), are
contained in $V$. Geometrically (in $R^2$ and $R^3$), this means that the entire line
segment joining any two points of $V$ is contained in $V$.

Fig. 7.7

A subset $V \subset R^n$ is bounded if there exists a positive number $K$ such
that for any $\xi = (x_1, x_2, \ldots, x_n) \in V$,

$$|x_1| < K; \quad |x_2| < K; \quad \ldots; \quad |x_n| < K.$$

Figure 7.7 pictures a convex but unbounded subset of $R^2$; the subset
pictured in figure 7.8 is both convex and bounded.

A vector $\xi$ belonging to a subset $V$ of $R^n$ is said to be an interior
point of $V$ if for each vector $\eta \in R^n$, there exists a positive number $a$ such
that $\xi + a\eta \in V$. In other words, if we move in an arbitrary direction
from the end of the vector $\xi$, then we remain for some time in the set $V$.

If $V$ is a flat surface in $R^3$, then $V$ has no interior points. In fact (see
fig. 7.9), if $\xi \in V$ and the vector $\eta$ is perpendicular to the plane in which
$V$ lies, then for any positive $a$ the vector $\xi + a\eta$ lies outside this plane
and, in particular, outside the set $V$.

A subset $V$ of $R^n$ is said to be full-dimensional if $V$ has an interior
point. A convex, bounded, full-dimensional subset $V$ of $R^n$ will be
called a convex body.¹





Fig. 7.8 Fig. 7.9 


Let us now consider a convex body $V$ symmetric with respect to the
origin (that is, if $\xi \in V$, then $-\xi \in V$) in which the origin is an interior
point.

This convex body can be used to define a metric $d_V$ on $R^n$. Suppose
$\xi \in R^n$, $\eta \in R^n$; let $\zeta = \xi - \eta$. Since the origin $(0, 0, \ldots, 0)$ is an
interior point of $V$, there exists a positive number $a$ for which $a\zeta \in V$. It
is easy to show that since $V$ is bounded, there is an upper bound to
$\{a \mid a\zeta \in V\}$ whenever $\zeta \ne 0$. In other words, there is a positive lower
bound to $\{1/a \mid a > 0,\ a\zeta \in V\}$.

At this point, we introduce the concept of greatest lower bound. If
$A \subset R$ is bounded below, then the greatest lower bound of $A$ (written
$\inf_{a \in A} a$) is the unique number $b$ for which the following conditions are
satisfied:²

1. If $a \in A$, then $b \le a$;

2. If $c$ is a lower bound for $A$, then $c \le b$.


1. We leave it to the reader to show that a convex body must have infinitely
many interior points.

2. The question of the uniqueness and existence of the greatest lower bound is
involved and need not concern us here. For a full discussion, see Walter Rudin,
Principles of Mathematical Analysis (New York: McGraw-Hill, 1964).

We are now in a position to define the distance $d_V(\xi, \eta)$ as the greatest
lower bound of $\{1/a \mid a > 0,\ a\zeta \in V\}$ (recall that $\zeta = \xi - \eta$), that is,

$$d_V(\xi, \eta) = \inf_{a\zeta \in V} \frac{1}{a}. \tag{7.6}$$

For $\zeta = 0$, that is, when the vectors $\xi$ and $\eta$ are equal, any $a$ is admitted
(as we always have $a\zeta = 0 \in V$), and the greatest lower bound in (7.6)
is zero. When $\xi$ and $\eta$ do not coincide, the vector $\zeta \ne 0$ and the
permissible values of $a$ are bounded from above by some positive number
$A$. Therefore, the values of $1/a$ are bounded below by $1/A$, so
$d_V(\xi, \eta) = \inf_{a\zeta \in V} 1/a > 0$. In other words, if $\xi \ne \eta$, $d_V(\xi, \eta)$ is
strictly greater than zero.

Because the convex body $V$ is symmetric with respect to the origin,
$a\zeta \in V$ if and only if $-a\zeta = a(-\zeta) \in V$. Since $-\zeta = \eta - \xi$, the
symmetry of the metric is shown; that is,

$$d_V(\xi, \eta) = d_V(\eta, \xi).$$

Thus, properties 1, 2, and 3 (page 12) hold for the metric $d_V$. Only the
triangle inequality remains to be shown.

Let $\xi$, $\eta$, and $\zeta$ be vectors in $R^n$. We choose two positive numbers $a$
and $b$ such that $a(\xi - \zeta) \in V$ and $b(\zeta - \eta) \in V$. Let us denote by $\alpha$ the
quantity

$$\alpha = \frac{1/a}{1/a + 1/b} = \frac{b}{a + b}.$$

Clearly, $0 < \alpha < 1$ and $1 - \alpha = a/(a + b)$. Because $V$ is convex, the
vector

$$\delta = \alpha[a(\xi - \zeta)] + (1 - \alpha)[b(\zeta - \eta)] \tag{7.7}$$

is also contained in $V$. Transforming expression (7.7) by substituting the
values of $\alpha$ and $1 - \alpha$ and using the properties of operations on vectors,
we get

$$\delta = \frac{b}{a + b}\,a(\xi - \zeta) + \frac{a}{a + b}\,b(\zeta - \eta)
= \frac{ab}{a + b}(\xi - \zeta) + \frac{ab}{a + b}(\zeta - \eta)
= c(\xi - \zeta + \zeta - \eta) = c(\xi - \eta), \tag{7.8}$$

where $c = ab/(a + b)$.

Since the vector $\delta = c(\xi - \eta) \in V$, the number

$$\frac{1}{c} = \frac{a + b}{ab} = \frac{1}{a} + \frac{1}{b}$$

must be at least as great as the distance $d_V(\xi, \eta)$ (by the definition of
$d_V(\xi, \eta)$ as $\inf 1/a$ taken over all $a$ with $a(\xi - \eta) \in V$):

$$d_V(\xi, \eta) \le \frac{1}{a} + \frac{1}{b}. \tag{7.9}$$

However, by the definition of the numbers $a$ and $b$, the quantities $1/a$
and $1/b$ can be made arbitrarily close to the distances $d_V(\xi, \zeta)$ and
$d_V(\zeta, \eta)$, respectively, since $d_V(\xi, \zeta)$ is the greatest lower bound of the
$1/a$ and $d_V(\zeta, \eta)$ is the greatest lower bound of the $1/b$. And since the
inequality is preserved in the limiting process, (7.9) yields the desired
inequality

$$d_V(\xi, \eta) \le d_V(\xi, \zeta) + d_V(\zeta, \eta). \tag{7.10}$$


It is possible to develop the definition of the metric $d_V$ in several other
ways.
The norm of the vector $\xi$ is the quantity

$$\|\xi\|_V = d_V(\xi, 0) = \inf_{a\xi \in V} \frac{1}{a}. \tag{7.11}$$

It is clear that the distance $d_V(\xi, \eta)$ defined above is just the norm of
the difference of the vectors $\xi$ and $\eta$, that is,

$$d_V(\xi, \eta) = \|\xi - \eta\|_V. \tag{7.12}$$

It can be shown (and we leave the proof to the reader) that this norm
satisfies the following properties for $\xi \in R^n$, $\eta \in R^n$, $a \in R$:

1. $\|\xi\|_V \ge 0$;

2. $\|\xi\|_V = 0$ if and only if $\xi = 0$;

3. $\|a\xi\|_V = |a|\,\|\xi\|_V$;

4. $\|\xi + \eta\|_V \le \|\xi\|_V + \|\eta\|_V$.
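
Definition (7.11) can be turned into a small numerical procedure when the body $V$ is given by a membership test. The sketch below is an illustration under assumptions: `gauge_norm`, `octahedron`, and `cube` are hypothetical names, the upper limit `hi` is assumed large enough that `hi * xi` lies outside $V$, and bisection is justified because, for a convex $V$ containing the origin, the admissible values of $a$ form an interval starting at zero.

```python
def gauge_norm(xi, contains, hi=1e6, iters=200):
    """Approximate ||xi||_V = inf{1/a : a > 0, a*xi in V} by bisection on a.

    `contains(v)` is a membership test for the convex body V.
    """
    if all(x == 0 for x in xi):
        return 0.0                      # every a is admitted, so the infimum is 0
    lo = 0.0                            # a = 0 gives the origin, which lies in V
    for _ in range(iters):
        mid = (lo + hi) / 2
        if contains(tuple(mid * x for x in xi)):
            lo = mid                    # mid*xi is still inside V
        else:
            hi = mid                    # mid*xi has left V
    return 1.0 / hi                     # hi approximates the largest admissible a


def octahedron(v):
    """Unit ball of d_1: |v_1| + ... + |v_n| <= 1."""
    return sum(abs(c) for c in v) <= 1


def cube(v):
    """Unit ball of d_inf: max |v_i| <= 1."""
    return max(abs(c) for c in v) <= 1


xi = (1, -2, 3)
print(gauge_norm(xi, octahedron))   # close to |1| + |-2| + |3| = 6
print(gauge_norm(xi, cube))         # close to max(|1|, |-2|, |3|) = 3
```

With the octahedron as $V$ the procedure recovers the norm belonging to $d_1$, and with the cube it recovers the norm belonging to $d_\infty$, in line with the earlier remark about the shapes of spheres in those spaces.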

It is possible to arrive at the concept of norm by a more abstract
route. We define a norm on the n-dimensional vector space $R^n$ as a
function $U: R^n \to R$ (where $\|\xi\|$ denotes $U(\xi)$) defined for each $\xi \in R^n$
in such a way that the following properties are satisfied (here $\xi \in R^n$,
$\eta \in R^n$, $a \in R$):

1. $\|\xi\| \ge 0$;

2. $\|\xi\| = 0$ if and only if $\xi = 0$;

3. $\|a\xi\| = |a|\,\|\xi\|$;

4. $\|\xi + \eta\| \le \|\xi\| + \|\eta\|$.

The vector space $R^n$ with a norm defined on it is called an n-dimensional
Minkowski space.³ It can be shown that every norm can be
defined by some symmetric convex body $V$. To verify this assertion, we
consider the set $V$ consisting of all vectors $\xi$ for which $\|\xi\| \le 1$. We
shall first show that $V$ is a symmetric convex body in $R^n$.

To show that $V$ is convex, let $\xi \in V$ and $\eta \in V$, and let $0 \le a \le 1$.
Then, by properties 3 and 4 of norms,

$$\|a\xi + (1 - a)\eta\| \le \|a\xi\| + \|(1 - a)\eta\| = a\|\xi\| + (1 - a)\|\eta\|.$$

Since $\|\xi\| \le 1$ and $\|\eta\| \le 1$, we have

$$\|a\xi + (1 - a)\eta\| \le a + (1 - a) = 1,$$

that is, the vector $a\xi + (1 - a)\eta$ also belongs to the set $V$; thus, $V$ is
convex.

Second, we must show that the origin is an interior point of the set $V$.
If $\xi$ is an arbitrary nonzero vector, we obtain upon setting $a = 1/\|\xi\|$
that

$$\|a\xi\| = a\|\xi\| = \frac{1}{\|\xi\|}\,\|\xi\| = 1;$$

that is, $a\xi \in V$. If $\xi = 0$, then for any positive real number $a$, $a\xi \in V$,
since by property 2, $\|a\xi\| = \|0\| = 0 < 1$.

The symmetry of the set $V$ follows from property 3; if $\xi \in V$, then
$\|-\xi\| = \|(-1)\xi\| = |-1|\,\|\xi\| = \|\xi\| \le 1$. So if $\xi \in V$, then $-\xi \in V$.

The proof of the boundedness of the set $V$ is somewhat more cumbersome,
and so we shall omit it.

Since the set $V$ is a convex body, it defines a norm $n$, denoted by
$n(\xi) = \|\xi\|_V$. It still remains to be shown that this norm coincides with
the one chosen at the beginning of the proof. Let $a$ be an arbitrary
positive number for which $a\xi \in V$. This means that $\|a\xi\| \le 1$, which
implies that $a\|\xi\| \le 1$, or $1/a \ge \|\xi\|$. So if $a\xi \in V$, $1/a \ge \|\xi\|$, and
(for $\xi \ne 0$) the least such $1/a$ is obtained by setting $1/a$ exactly equal to
$\|\xi\|$; that is, by setting $a = 1/\|\xi\|$. In other words, the greatest lower
bound of the $1/a$ is $\|\xi\|$, or

$$\inf_{a\xi \in V} \frac{1}{a} = \|\xi\|. \tag{7.13}$$

Comparing equations (7.13) and (7.11), we see that

$$\|\xi\| = \|\xi\|_V, \tag{7.14}$$

the desired result.


3. In honor of the great mathematician H. Minkowski, one of the creators of
the theory of relativity.

In $R^3$, the norm defined by a given convex body $V$ has a simple
physical interpretation.

Let us suppose that we have some anisotropic device which propa- 
gates sound waves at various speeds in different directions, and con- 
sider the case in which the speed of sound in opposite directions is the 
same. 

We now choose a unit of speed (such as miles per hour) and construct 
in each direction from the origin a vector whose length is equal to the 
speed of sound in that direction. We make the further assumption that 
the terminal points of these vectors bound a convex body V. Clearly, V 
is bounded, symmetric with respect to the origin, and contains at least 
one interior point, the origin. So there is a norm $\|\xi\|_V$ and a metric $d_V$
defined by

$$d_V(\xi, \eta) = \|\xi - \eta\|_V.$$

We leave it to the reader to verify that the distance $d_V(\xi, \eta)$ is numerically
equal to the time required for a sound wave to travel from the
end of the vector $\xi$ to the end of the vector $\eta$ along the straight line
connecting them.

In addition to the finite-dimensional Minkowski spaces, one can
consider their infinite-dimensional analogs, the so-called Banach
spaces.⁴ In general, a Banach space is a vector space on which a norm
can be defined (that is, a space satisfying all of the properties listed on
page 49 along with a norm possessing all of the properties listed on
page 56).


4. In honor of the Polish mathematician S. Banach (1892-1945), one of the
founders of functional analysis, an important branch of modern mathematics.

We can construct an example of an infinite-dimensional Banach space
in the following way: Let $C([0, 1])$ be the set of all continuous functions
on the closed interval $[0, 1] = \{t \mid 0 \le t \le 1\}$. The sum of two such
vectors (in this case, functions) is defined by its operation on a number $t$;
that is, $f + g$ is defined by $(f + g)(t) = f(t) + g(t)$ for $f$ and $g$ elements
of $C([0, 1])$. Similarly, the scalar product $af$ (for $a \in R$) is defined by
$(af)(t) = a[f(t)]$. The zero vector is the function 0 defined by $0(t) = 0$,
all $t \in [0, 1]$. As the norm of a function we take the maximum of the
absolute values of elements of the range, that is,

$$\|f\| = \max_{0 \le t \le 1} |f(t)|.$$


(It can be shown that because of the choice of the domain [0, 1], such a 
maximum necessarily exists.) 

It is easy to show that all of the previously listed vector space and 
norm properties are satisfied by the space C([0, 1]) with the norm 
defined above. 

Banach spaces in which the vectors are functions play an important 
role in modern mathematics. 

The claim that metric spaces whose points are functions are
infinite-dimensional can in some sense be justified by the following reasoning.

We partition the closed interval $[0, 1]$ by drawing vertical lines
through $n$ of its points (fig. 7.10). Now we take the vector
$\xi = (x_1, x_2, \ldots, x_n) \in R^n$ and represent its coordinates on these vertical
lines. The points in the plane determined in this way form the
graph of some function defined on the $n$ chosen points. Clearly, as
$n \to \infty$, this set of points “converges” to the graph of a continuous
function if we have chosen points in $R^n$ whose coordinates on adjacent
vertical lines become arbitrarily close as $n \to \infty$. If we define a norm on $R^n$ by

$$\|\xi\| = \max(|x_1|, |x_2|, \ldots, |x_n|)$$

(where $\xi = (x_1, x_2, \ldots, x_n) \in R^n$), then this norm “in the limit” (as
$n \to \infty$) becomes the norm defined by

$$\|f\| = \max_{0 \le t \le 1} |f(t)|,$$

where $f \in C([0, 1])$.

Fig. 7.10

The point is that $n$, the dimension of the normed space in question,
increases without bound, indicating that the “limiting” space $C([0, 1])$
is infinite-dimensional.
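
The passage from the maximum norm on $R^n$ to the maximum norm on $C([0, 1])$ can be watched numerically by sampling one continuous function at more and more points. This is only a sketch: the function `sampled_max_norm`, the equally spaced points, and the sample function $\sin 3t$ are assumptions made for illustration.

```python
import math


def sampled_max_norm(f, n):
    """Max-norm of the vector (f(t_1), ..., f(t_n)) for n >= 2 equally
    spaced points t_i of the interval [0, 1]."""
    points = [i / (n - 1) for i in range(n)]
    return max(abs(f(t)) for t in points)


def f(t):
    """Hypothetical sample function; its maximum modulus on [0, 1] is 1."""
    return math.sin(3 * t)


for n in (2, 5, 11, 101, 1001):
    # The norm of the n-dimensional sample vector approaches max |f(t)| = 1.
    print(n, sampled_max_norm(f, n))
```

The sampled norm can never exceed the true maximum, and for a continuous function it comes arbitrarily close to it once the sample points are dense enough.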


8 The Smoothing of Errors in Experimental Measurements


In the measurement of physical quantities, experimental results often
appear as a sequence $(x_1, x_2, \ldots, x_n)$ of observed values.

The quantity itself can be constant or variable. In the latter case, the
values $x_1, x_2, \ldots, x_n$ should vary according to some law; in the former
case, they should be nearly equal. But in any case, the measured quantities
$x_1, x_2, \ldots, x_n$ are subject to error. In other words, there are
inherent experimental imperfections that hinder the reception of
information from nature.

The mathematical problem concerned with the treatment of measure- 
ments is that of the establishment (so far as possible) of the correct 
information. The solution lies in the application of concepts developed 
previously for the automatic correction of errors in discrete messages. 

If the measured quantities can take on arbitrary real values, we can
consider the n-dimensional vector space $R^n$ as our space of information.
The distance $d(\xi, \eta)$ between points of this space can be defined to fit the
experiment being carried out. But most frequently, a metric $d$ of the
form

$$d(\xi, \eta) = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2} \tag{8.1}$$

is used, for which the space of information is $l_2^{(n)}$.

Let $N \subset R^n$ be a subset of this space of information (the set of
theoretically possible messages).

As a “correct” message, we take the vector $\eta \in N$ “closest” to the
message $\xi$ that is received, that is, a vector $\eta$ such that

$$d(\xi, \eta) = \min_{\eta' \in N} d(\xi, \eta') \tag{8.2}$$

(if such a vector exists). It can be shown that in the interesting cases (for
example, when the set $N$ consists of all vectors which require the $x_i$ to lie
along some curve when plotted against time coordinates $t_i$) such a
minimum exists, and thus so does the vector $\eta$. For the metric defined
by (8.1), this principle is commonly known as the method of least
squares, a method introduced by the great German mathematician Karl
Friedrich Gauss.

Let us examine a concrete example of a subset of theoretically
possible messages. We suppose that the measured quantity changes
linearly with respect to time, that is, if $y$ is the measured quantity,

$$y = kt + b,$$

where $k$ and $b$ are some constants and $t$ is the time.¹
This means that each vector $\eta \in N$ has the form $\eta = (y_1, y_2, \ldots, y_n)$,
where

$$y_1 = kt_1 + b, \quad y_2 = kt_2 + b, \quad \ldots, \quad y_n = kt_n + b.$$

Let the vector actually obtained by measuring this quantity be equal
to $\xi = (x_1, x_2, \ldots, x_n)$. The fundamental condition (8.2) is now written
as follows:

$$F(k, b) = (kt_1 + b - x_1)^2 + (kt_2 + b - x_2)^2 + \cdots + (kt_n + b - x_n)^2 = \min.$$

In the expression $F(k, b)$, the unknowns are the parameters $k$ and $b$
defining the unknown theoretically possible messages; the quantities
$t_1, t_2, \ldots, t_n$ and $x_1, x_2, \ldots, x_n$ are experimentally known.

To find the minimum value of the quantity $F(k, b)$, we use a criterion
from differential calculus:

$$\frac{\partial F}{\partial k} = 0; \quad \frac{\partial F}{\partial b} = 0, \tag{8.3}$$

which, in the given case of a positive quadratic function $F(k, b)$, is
necessary and sufficient for a minimum.

1. For the sake of simplicity, we shall assume that the error involved in defining
moments of time $t_i$ is negligible.

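Let us calculate the partial derivatives:

$$\begin{aligned}
\frac{\partial F}{\partial k} &= 2t_1(kt_1 + b - x_1) + \cdots + 2t_n(kt_n + b - x_n), \\
\frac{\partial F}{\partial b} &= 2(kt_1 + b - x_1) + \cdots + 2(kt_n + b - x_n).
\end{aligned} \tag{8.4}$$

For convenience, we denote

$$\begin{aligned}
[t^2] &= t_1^2 + t_2^2 + \cdots + t_n^2, \\
[t] &= t_1 + t_2 + \cdots + t_n, \\
[tx] &= t_1x_1 + t_2x_2 + \cdots + t_nx_n, \\
[x] &= x_1 + x_2 + \cdots + x_n, \\
[1] &= 1 + 1 + \cdots + 1 = n.
\end{aligned}$$

The expression (8.4) can then be written in the form

$$\begin{aligned}
\frac{\partial F}{\partial k} &= 2[t^2]k + 2[t]b - 2[tx], \\
\frac{\partial F}{\partial b} &= 2[t]k + 2[1]b - 2[x].
\end{aligned} \tag{8.5}$$

Setting these expressions equal to zero in accordance with (8.3), dividing
by two, and transferring the free terms to the right side, we get the
fundamental equation of the method of least squares in symbolic form:

$$\begin{aligned}
[t^2]k + [t]b &= [tx], \\
[t]k + [1]b &= [x].
\end{aligned} \tag{8.6}$$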


Figure 8.1 pictures measured values $x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8$, and the
straight line $y = kt + b$ defined according to the method of least
squares. The figure makes it clear why we speak of the “smoothing” of
errors.

Table 8.1

  i     x_i     t_i     t_i x_i   t_i^2
  1     0.20    0.30    0.06      0.09
  2     0.43    0.91    0.39      0.83
  3     0.35    1.50    0.53      2.25
  4     0.52    2.00    1.04      4.00
  5     0.81    2.20    1.78      4.84
  6     0.68    2.62    1.79      6.86
  7     1.15    3.00    3.45      9.00
  8     0.85    3.30    2.81     10.89
  Sums: [x] = 4.79   [t] = 15.83   [tx] = 11.85   [t^2] = 38.76

Table 8.1 shows the order in which the calculation is carried out.
The system (8.6) in this case has the form

$$\begin{aligned}
38.76k + 15.83b &= 11.85, \\
15.83k + 8b &= 4.79.
\end{aligned}$$

The solution is $k = 0.319$, $b = -0.032$. The unknown “message” is
$y = 0.319t - 0.032$.
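
The system (8.6) is a pair of linear equations in $k$ and $b$ and can be solved directly. The following sketch is an illustration, not the author's procedure: the function names are assumptions, Cramer's rule is one convenient way to solve a 2-by-2 system, and the sums are taken as printed in the last row of table 8.1 rather than recomputed from the rows.

```python
def bracket_sums(t, x):
    """Return ([t^2], [t], [tx], [x], [1]) for paired measurements."""
    return (
        sum(ti * ti for ti in t),
        sum(t),
        sum(ti * xi for ti, xi in zip(t, x)),
        sum(x),
        len(t),
    )


def solve_normal_equations(s_tt, s_t, s_tx, s_x, n):
    """Solve [t^2]k + [t]b = [tx], [t]k + [1]b = [x] by Cramer's rule."""
    det = s_tt * n - s_t * s_t          # nonzero when the t_i are not all equal
    k = (s_tx * n - s_t * s_x) / det
    b = (s_tt * s_x - s_t * s_tx) / det
    return k, b


# Sums as printed in table 8.1.
k, b = solve_normal_equations(38.76, 15.83, 11.85, 4.79, 8)
print(round(k, 3), round(b, 3))

# A tiny hypothetical data set lying exactly on the line y = 2t + 1.
print(solve_normal_equations(*bracket_sums([0, 1, 2], [1, 3, 5])))
```

Rounded to three decimals, the first call reproduces the values of $k$ and $b$ quoted in the text; the second call recovers the slope 2 and intercept 1 of the exact line.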

Analogously, if the set of theoretically possible messages $N$ consists
of all parabolic functions of the form $y = at^2 + bt + c$, then the
fundamental condition (8.2) can be written in the form

$$F(a, b, c) = (at_1^2 + bt_1 + c - x_1)^2 + \cdots + (at_n^2 + bt_n + c - x_n)^2 = \min.$$

The minimizing of the function $F(a, b, c)$ reduces to the solution of a
system of three linear equations in three unknown parameters $a$, $b$,
and $c$.
The method of least squares can also be easily carried out in the case
of a metric $d$ of the form

$$d(\xi, \eta) = \sqrt{a_1(x_1 - y_1)^2 + a_2(x_2 - y_2)^2 + \cdots + a_n(x_n - y_n)^2}, \tag{8.7}$$

where $\xi = (x_1, x_2, \ldots, x_n)$ and $\eta = (y_1, y_2, \ldots, y_n)$ are elements of $R^n$,
and $a_1, a_2, \ldots, a_n$ are positive real numbers (weights). Unequal weights
must be employed if it is known that separate measurements in the
experiment are not equally exact. In this case, it is necessary to assign
smaller weights to less accurate measurements.

The fundamental principle (8.2) for the smoothing of errors can also
be applied to the metrics of the spaces $C^{(n)}$ and $l_1^{(n)}$. However, in these
cases, methods of determining the set of theoretically possible messages
are more complicated.


9 A More General Definition of Distance


As we have already stated, various generalizations of the notion of
distance are possible. One of the most radical is used in the theory
of relativity, where we consider the space-time universe consisting of
points of the form $(x, y, z, t)$, where $x$, $y$, and $z$ are spatial coordinates
and $t$ is the time coordinate. The distance (space-time interval) between
two such points $\xi = (x, y, z, t)$ and $\eta = (x_1, y_1, z_1, t_1)$ is defined by the formula

$$d(\xi, \eta) = \sqrt{c^2(t - t_1)^2 - (x - x_1)^2 - (y - y_1)^2 - (z - z_1)^2}, \tag{9.1}$$

where $c$ is the speed of light. It is clear that the metric $d$ can assume
imaginary as well as real values.

It is also possible to generalize the concept of distance by assuming
that a function $d$, satisfying axioms 1, 2, 3, and 4 (chap. 3, page 12), can
have infinite value.¹ In this case, however, the space could be partitioned
into disjoint subsets, each of which would be a metric space in the usual 
sense. Consequently, such a generalization is not very interesting. The 
proof of this fact can be sketched as follows. 

Examining such a “generalized” space E, let us say two elements 
M and N of E are “equivalent” if the distance d(M,N) is finite. Then, 
clearly, each point M is equivalent to itself, and if M is equivalent to N, 
N is equivalent to M (d(N,M) = d(M,N)). If M is equivalent to L and 
L is equivalent to N, then since 


d(M,N) < d(M,L) + d(L,N), 


d(M,N) is finite and M is equivalent to N. Thus the relation of “equiv- 
alence” partitions the space E into “equivalence classes,” each of 
which is an ordinary metric space with a finite distance function. 
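
The partition into classes of points at finite distance from one another can be computed directly. This is a sketch under assumptions: `finite_distance_classes` and the sample distance `d` (finite only between points carrying the same label) are hypothetical. Comparing each point with a single representative of a class is enough precisely because the relation has just been shown to be an equivalence.

```python
import math


def finite_distance_classes(points, d):
    """Group points so that two points share a class exactly when d is finite."""
    classes = []
    for p in points:
        for cls in classes:
            if math.isfinite(d(p, cls[0])):   # compare with the class representative
                cls.append(p)
                break
        else:
            classes.append([p])               # p starts a new class
    return classes


def d(p, q):
    """Hypothetical generalized distance on labelled points of the line:
    ordinary distance within one label, +infinity between different labels."""
    (label_p, x), (label_q, y) = p, q
    return abs(x - y) if label_p == label_q else math.inf


points = [("a", 0.0), ("b", 1.0), ("a", 2.5), ("b", -1.0)]
print(finite_distance_classes(points, d))
```

Each resulting class carries an ordinary, finite-valued metric, which is exactly why this generalization adds little.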

1. More precisely, the value $+\infty$.

What this example points out is that it is no simple matter to come up
with a meaningful generalization of an abstract mathematical concept. 
In every case, such a generalization must come from a deep study of the 
mathematical objects involved and not simply from a formal manipula- 
tion of axioms. The abortive attempt described above notwithstanding, 
there do exist a number of meaningful generalizations of the concept of 
metric space, one of which we shall study further. By giving up the 
axiom of symmetry (axiom 1), we obtain a class of spaces which is 
connected with some interesting mathematical objects. 

We shall define a generalized metric space to be a set $E$ and a function
$d: E \times E \to R$ (meaning that $d$ has as its domain the Cartesian product
of $E$ with itself and as its range the real numbers) with the following
properties (here $\xi$, $\eta$, and $\zeta$ are elements of $E$):

1. $d(\xi, \eta) \ge 0$.

2. The double equality $d(\xi, \eta) = d(\eta, \xi) = 0$ is satisfied if and only if
$\xi = \eta$.

3. $d(\xi, \eta) \le d(\xi, \zeta) + d(\zeta, \eta)$.


Clearly, any ordinary distance function satisfies these conditions.
However, a function nonsymmetric with respect to its arguments can also
satisfy axioms 1, 2, and 3. In fact, we introduced such a nonsymmetric
distance function at the end of chapter 4 in connection with the
definition of distance as the minimal time required for travel from one
point to another. Since a journey in the opposite direction may require
more time, this metric is, in general, not symmetric, but the triangle
inequality (and axioms 1 and 2) are easily verified.

Fig. 9.1

Another nonsymmetric distance function is definable on the space 
consisting of the ten vertices of the diagram in figure 9.1. 

The distance $d(M_i, M_j)$ between the points $M_i$ and $M_j$ is defined as
the minimal number of line segments passing against the arrows in a
path joining $M_i$ and $M_j$.

For example, 





AM, Mio) = 4; dM, Mı) = 0; 


ad(M3, My) = 3; d(M;, Ma) = 1; and so on. 
Clearly, $d(M_i, M_j) \ge 0$. The condition $d(M_i, M_j) = d(M_j, M_i) = 0$
(both distances being zero) means that it is possible to join the point $M_i$
to the point $M_j$ and to join $M_j$ to $M_i$ by means of line segments directed
with the arrows; that is, $M_i$ and $M_j$ are vertices of a closed path on
which all arrows go the same way. As there are no such loops in figure
9.1, the equality $d(M_i, M_j) = d(M_j, M_i) = 0$ implies that the points $M_i$
and $M_j$ coincide. Thus, conditions 1 and 2 hold for this nonsymmetric
distance function.

The triangle inequality can be verified by the following argument. We
examine a path with the minimal number of segments directed against
the arrows joining $M_i$ to $M_k$ and an analogous path from $M_k$ to $M_j$.
Joining these paths, we obtain a path from $M_i$ to $M_j$ with the number of
line segments directed against the arrows equal to $d(M_i, M_k) + d(M_k, M_j)$.
Since in the “shortest” path from $M_i$ to $M_j$ the number of
such segments is at least as small,

$$d(M_i, M_j) \le d(M_i, M_k) + d(M_k, M_j). \tag{9.2}$$

In this example it is possible to define a new metric $d^*$ by the rule

$$d^*(M_i, M_j) = d(M_i, M_j) + d(M_j, M_i). \tag{9.3}$$

Clearly $d^*(M_i, M_j)$ possesses the properties of an ordinary metric.
The analogous proposition is true in an arbitrary generalized metric
space.


THEOREM If $(S, d)$ is a generalized metric space and $d^*: S \times S \to R$
is defined by

$$d^*(\xi, \eta) = d(\xi, \eta) + d(\eta, \xi), \tag{9.4}$$

then $(S, d^*)$ is a metric space in the ordinary sense.


Proof. The symmetry of the metric $d^*$ follows because the right side
of (9.4) does not change upon interchanging $\xi$ and $\eta$. The equality
$d^*(\xi, \eta) = 0$ is equivalent to the double equality $d(\xi, \eta) = d(\eta, \xi) = 0$
(since $d$ takes on no negative values) and is therefore equivalent to the
statement $\xi = \eta$. Finally, since

$$d(\xi, \eta) \le d(\xi, \zeta) + d(\zeta, \eta)$$

and

$$d(\eta, \xi) \le d(\eta, \zeta) + d(\zeta, \xi),$$

we get

$$d(\xi, \eta) + d(\eta, \xi) \le d(\xi, \zeta) + d(\zeta, \xi) + d(\zeta, \eta) + d(\eta, \zeta),$$

or

$$d^*(\xi, \eta) \le d^*(\xi, \zeta) + d^*(\zeta, \eta),$$

and so the triangle inequality holds for the metric $d^*$.
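
The theorem can be tried out on a small diagram of arrows. The sketch below makes several assumptions: the four-vertex diagram is hypothetical (it is not figure 9.1), the function names are invented for illustration, and the distances are obtained with the Floyd-Warshall all-pairs shortest-path scheme, where a step along an arrow costs 0 and a step against an arrow costs 1.

```python
from itertools import product


def against_arrow_distance(vertices, arrows):
    """d[u, v]: minimal number of segments traversed against their arrows
    on a path from u to v."""
    inf = float("inf")
    d = {(u, v): (0 if u == v else inf) for u, v in product(vertices, repeat=2)}
    for u, v in arrows:                      # arrow drawn from u to v
        d[u, v] = min(d[u, v], 0)            # moving with the arrow is free
        d[v, u] = min(d[v, u], 1)            # moving against it costs one
    # Floyd-Warshall: product() varies its first component slowest,
    # so the intermediate vertex k is the outermost loop.
    for k, i, j in product(vertices, repeat=3):
        if d[i, k] + d[k, j] < d[i, j]:
            d[i, j] = d[i, k] + d[k, j]
    return d


def symmetrize(vertices, d):
    """The metric d* of (9.4)."""
    return {(u, v): d[u, v] + d[v, u] for u, v in product(vertices, repeat=2)}


def is_metric(vertices, dist):
    """Check symmetry, vanishing only on the diagonal, and the triangle inequality."""
    for u, v, w in product(vertices, repeat=3):
        if dist[u, v] != dist[v, u]:
            return False
        if (dist[u, v] == 0) != (u == v):
            return False
        if dist[u, v] > dist[u, w] + dist[w, v]:
            return False
    return True


vertices = ["A", "B", "C", "D"]
arrows = [("A", "B"), ("B", "C"), ("A", "D")]   # hypothetical diagram without loops
d = against_arrow_distance(vertices, arrows)
d_star = symmetrize(vertices, d)
print(d["A", "C"], d["C", "A"], d_star["C", "D"])   # 0 2 3
print(is_metric(vertices, d), is_metric(vertices, d_star))   # False True
```

The distance $d$ itself fails the symmetry test, as expected, while $d^*$ passes all three checks.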

Another interesting example of a generalized metric space can be 
obtained using the important concept of a partially ordered set. 

A set $S$ is said to be partially ordered if for some ordered pairs of
points $(M, N) \in S \times S$, the relation $M \prec N$ (read M precedes N) is
defined and satisfies the following axioms:

1. If $M \prec N$ and $N \prec M$, then $M = N$ (antisymmetry).

2. If $M \prec N$ and $N \prec L$, then $M \prec L$ (transitivity).

An example of a partially ordered set is the set of vertices in figure 9.1.
We set $M_i \prec M_j$ if there is a path joining $M_i$ to $M_j$ which moves only in
the direction of the arrows. For example, Ms ( Mio, Mı ( Mz, Ma ( Mg, 
M,(M.. 

A second example is obtained by considering $\prec$ to denote the relation
“$\le$” on the real line; that is, $x \prec y$ if and only if $x \le y$. In this case, it
is clear that for each pair of distinct points $x$ and $y$ either $x \prec y$ or $y \prec x$
is valid. A set with such an ordering (in which for each pair of distinct
points $x$ and $y$, either $x \prec y$ or $y \prec x$) is said to be linearly ordered by $\prec$.

In any partially ordered set it is possible to introduce the notion of an
immediate predecessor.

A point $M$ is said to immediately precede a point $N$ (and we write
$M \lessdot N$) if $M \prec N$ and there is no third point $L$ different from $M$ and $N$
lying “between” $M$ and $N$; that is, such that $M \prec L \prec N$.

For example, in figure 9.1, M O Ma, M2 O M4, Mi O Ma, and so on. 

No real number has an immediate predecessor, for if $x \prec y$ and $x \ne y$, then
$x \prec (x + y)/2 \prec y$, since $x < (x + y)/2 < y$.

We now consider a finite partially ordered set $E$ and suppose that this
set has the property of connectedness; that is, for each pair of points $M$
and $N$ in $E$ there exists a sequence of points $M = L_1, L_2, \ldots, L_k = N$
such that for each $i$ with $1 \le i \le k - 1$, either $L_i \prec L_{i+1}$ or $L_{i+1} \prec L_i$.
For the points M, and Ms in figure 9.1, for example, we can construct 
such a sequence as follows: 


Lı = M3; L =M,; L = Mo; L, = Ma, 
since


Mı (M3; M: (M, ; Ma ( M3. 


We leave it to the reader to check that the set of vertices in figure 9.1 is 
connected. 

The set of points in figure 9.2 is not 

Ma connected, since for the points Ma and 


M 
M 2 . 
l M3 such a connecting sequence does 
not exist. However, the subsets E, = 
(M,, M2, Ma, Ma} and E, = (Ms, Me, 


M5 Ms M,, Mg} are connected. One can easily 
M6 Mg verify that any finite partially ordered 
set can be partitioned into disjoint 
connected subsets. 
Fig. 9.2

We are now in a position to introduce into an arbitrary partially ordered
set $E$ a metric $d$ defined according to the following rule. We first define
a path from a point $M$ to a point $N$ to be a chain of points
$M = L_1, L_2, \ldots, L_k = N$ such that for each $i$ with $1 \le i \le k - 1$, either
$L_i \lessdot L_{i+1}$ or $L_{i+1} \lessdot L_i$. We then define the distance $d(M, N)$ to be the
length of the shortest path from $M$ to $N$, the length of a path being
defined as the number of integers $i$ such that $1 \le i \le k - 1$ and
$L_{i+1} \lessdot L_i$ (that is, the number of steps “against the arrows”).

The nonnegativity of the distance d(M, N) follows from the definition. 
For the proof of the second axiom of distance we note that for distinct 
points M and N, the condition d(M, N) = 0 implies that there exists a 
chain of points M = L₁, L₂, …, Lₖ = N such that Lᵢ immediately precedes 
Lᵢ₊₁ and, in particular, Lᵢ ⊂ Lᵢ₊₁ for each i. But then, by the second 
axiom (transitivity) for partially ordered sets, we get that M ⊂ N. 
Analogously, from the condition d(N, M) = 0, it follows that N ⊂ M. 
Thus, if d(M, N) = d(N, M) = 0 is satisfied, M ⊂ N and N ⊂ M; so, by the 
first axiom (antisymmetry) for partially ordered sets, M = N. Conversely, 
if M = N, then the length d(M, N) of the shortest path from M to N 
and the length d(N, M) of the shortest path from N to M are equal to 
zero. So the metric d satisfies the second condition for a generalized 
metric. 

For the proof of the third axiom (the triangle inequality) we employ 
a familiar method. Taking a shortest path M = L₁, L₂, …, Lᵣ = Q 
from M to Q and a shortest path Q = Lᵣ, Lᵣ₊₁, …, Lᵣ₊ₚ = N from 
Q to N, we form a path M = L₁, L₂, …, Lᵣ = Q, Lᵣ₊₁, …, Lᵣ₊ₚ = N 
from M to N. The total number of pairs of adjacent points in this 
chain for which Lᵢ₊₁ immediately precedes Lᵢ is equal to the sum of the 
distances d(M, Q) and d(Q, N). Clearly, the number of such pairs in the 
shortest path from M to N cannot be larger: 


d(M, N) ≤ d(M, Q) + d(Q, N). (9.5) 


We have thus shown that for any finite connected partially ordered set 
E we can define a metric d so that (E, d) forms a (generalized) metric 
space. 

As an exercise, we suggest that the reader prove that for every pair of 
points M and N where M ⊂ N, the distance d(M, N) = 0. 

In a sense, the converse assertion is also true. In any generalized 
metric space (E, d) it is possible to introduce a partial ordering ⊂ 
defined by M ⊂ N if d(M, N) = 0. 

To prove this, we must show that both axioms for a partial ordering 
are satisfied. If M ⊂ N and N ⊂ M, then d(M, N) = d(N, M) = 0, and 
by the second condition for a generalized metric, M = N. The first 
axiom for partially ordered sets is thus proved. 

Now let M ⊂ L and L ⊂ N. Then d(M, L) = 0 and d(L, N) = 0. By 
the triangle inequality 


d(M, N) ≤ d(M, L) + d(L, N) = 0; 


but by nonnegativity, 0 ≤ d(M, N), and so 0 ≤ d(M, N) ≤ 0; that is, 
d(M, N) = 0 and M ⊂ N. 

Thus, we have shown that if M ⊂ L and L ⊂ N, then M ⊂ N. 
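
The passage from a distance to an ordering is easy to imitate in a few lines. In the sketch below, the three points A, B, C and the table of distances are invented for illustration (the table does satisfy the triangle inequality), and the function names are our own; the relation is read off from the pairs at distance zero, and the two axioms are then tested directly.

```python
from itertools import product

# Hypothetical generalized metric on three points; note d(A, B) != d(B, A).
points = ["A", "B", "C"]
d = {
    ("A", "A"): 0, ("B", "B"): 0, ("C", "C"): 0,
    ("A", "B"): 0, ("B", "A"): 1,
    ("B", "C"): 0, ("C", "B"): 1,
    ("A", "C"): 0, ("C", "A"): 2,
}


def order_from_metric(points, d):
    """The relation 'M precedes N' holds exactly when d(M, N) = 0."""
    return {(m, n) for m, n in product(points, repeat=2) if d[m, n] == 0}


def is_partial_order(points, rel):
    """Test antisymmetry (first axiom) and transitivity (second axiom)."""
    antisymmetric = all(
        m == n
        for m, n in product(points, repeat=2)
        if (m, n) in rel and (n, m) in rel
    )
    transitive = all(
        (m, n) in rel
        for m, mid, n in product(points, repeat=3)
        if (m, mid) in rel and (mid, n) in rel
    )
    return antisymmetric and transitive


rel = order_from_metric(points, d)
print(is_partial_order(points, rel))
```

Here the ordering obtained is simply the chain in which A precedes B and B precedes C.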

The partially ordered set which we obtain in this way need not be 
connected. 

Examining, for example, the partially ordered set E of the points in 
figure 9.2, one can define between pairs of points from the subset 
E₁ = {M₁, M₂, M₃, M₄} a distance by means of the shortest path. The 
same can be done for the subset E₂ = {M₅, M₆, M₇, M₈}. We further 
define the distance between a point Mᵢ ∈ E₁ and a point Mⱼ ∈ E₂ by 


d(Mᵢ, Mⱼ) = d(Mⱼ, Mᵢ) = 100. (9.6) 


It is easy to verify that we get a generalized metric space in which the 
equality d(M, N) = 0 is equivalent to the relation M ⊂ N in the partially 
ordered set E. As we have already remarked, however, E is not a 
connected set. 
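
A construction of this kind can be imitated on a much smaller set. The sketch below does not reproduce figure 9.2; it assumes two hypothetical two-point pieces, in which a1 precedes a2 and b1 precedes b2, and sets every distance between the pieces equal to 100, as in (9.6). All names in it are our own.

```python
from itertools import product

# Hypothetical disconnected set: two separate two-point chains.
E1, E2 = ["a1", "a2"], ["b1", "b2"]
points = E1 + E2
leq = {(p, p) for p in points} | {("a1", "a2"), ("b1", "b2")}


def d(m, n):
    """Distance in the style of (9.6) on the hypothetical set above."""
    if (m in E1) != (n in E1):
        return 100            # points taken from different pieces
    if (m, n) in leq:
        return 0              # m precedes n
    return 1                  # the only remaining case: n immediately precedes m


triangle_holds = all(
    d(m, n) <= d(m, q) + d(q, n) for m, q, n in product(points, repeat=3)
)
zero_iff_ordered = all(
    (d(m, n) == 0) == ((m, n) in leq) for m, n in product(points, repeat=2)
)
print(triangle_holds, zero_iff_ordered)
```

The value 100 plays no special role here; any number large enough to keep the triangle inequality intact would serve equally well.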

It is possible, however, to introduce the notion of a connected 
generalized metric space. The space (E, d) is said to be connected if for 
any pair of (not necessarily distinct) points M and N of E there exists a 
chain of points M = L₁, L₂, …, Lₖ = N such that for each adjacent 
pair of points Lᵢ and Lᵢ₊₁ either d(Lᵢ, Lᵢ₊₁) = 0 or d(Lᵢ₊₁, Lᵢ) = 0. 
We leave it to the reader to verify that any connected generalized metric 
space corresponds to a connected partially ordered set. 

A finite partially ordered set (and the corresponding metric space) 
can be represented geometrically in a very simple manner. We depict the 
elements of the partially ordered set as points in three-dimensional space 
denoted by the same letters as the corresponding elements. We join each 
pair of points M and N for which M immediately precedes N by a line 
segment directed 
from N to M, indicating the direction by an arrow. The geometric 
figure obtained, consisting of the points (vertices) and the directed line 
segments joining them, is called a graph. We have already seen examples 
of graphs in figures 9.1 and 9.2. 

It is easy to see that if M ⊂ N, it is possible to travel from N to M by 
means of a path that moves only in the direction of the arrows. 
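
This remark, too, can be tested on an example. The sketch below assumes the hypothetical set of divisors of 12 ordered by divisibility, with the pairs (m, n) in covers meaning that m immediately precedes n; an arrow is drawn from n to m for each such pair, and we collect the points that can be reached from a given point by following arrows. The function name is our own.

```python
# Hypothetical example: (m, n) means "m immediately precedes n".
points = [1, 2, 3, 4, 6, 12]
covers = {(1, 2), (1, 3), (2, 4), (2, 6), (3, 6), (4, 12), (6, 12)}


def reachable_along_arrows(covers, start):
    """Points reachable from `start` by moving only along the arrows."""
    arrows = {}
    for m, n in covers:
        arrows.setdefault(n, []).append(m)   # arrow directed from n to m
    seen = {start}
    stack = [start]
    while stack:
        p = stack.pop()
        for q in arrows.get(p, []):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen


# For each n, the reachable points should be exactly those preceding n,
# that is, in this example, the divisors of n.
agrees = all(
    reachable_along_arrows(covers, n) == {m for m in points if n % m == 0}
    for n in points
)
print(agrees)
```

In this example the points reachable from 6 are 6, 3, 2 and 1, which are exactly the points preceding 6.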

Metric spaces with nonsymmetric distance functions are especially 
important in connection with the concept of a discrete topological space. 

With this we conclude our study of the concept of distance. We have 
established that this concept in its many different aspects is connected 
not only with problems in pure mathematics, but with such practical 
problems as the construction of error-stabilizing codes. This multiplicity 
of applications and the complicated logical connections are characteristic 
of other essential mathematical concepts as well. The principal 
motivation for the creation of such concepts lies in the possibility of 
connections and analogies to seemingly unrelated fields and in the need 
to discover the hidden principles upon which mathematical properties 
depend. 